Abstract

We present a new formalism for light beam optics starting with an
exact eight-dimensional matrix representation of the Maxwell equations.
The Foldy-Wouthuysen iterative diagonalization technique is employed to
obtain a Hamiltonian description for a system with varying refractive
index. Besides reproducing all the traditional quasiparaxial terms, this
method leads to additional wavelength-dependent contributions in the
optical Hamiltonian. This alternate prescription for obtaining the
aberration expansion is applied to the axially symmetric graded index
fiber. This results in wavelength-dependent modifications of the paraxial
behaviour and the aberration coefficients. Furthermore, it predicts a
wavelength-dependent image rotation. In the low wavelength limit our
formalism reproduces the Lie algebraic formalism of optics. The
Foldy-Wouthuysen technique employed by us is ideally suited for the Lie
algebraic approach to optics. The present study further strengthens the
close analogy between the various prescriptions of light and
charged-particle optics. All the associated machinery used in this
formalism is described in the text and the accompanying appendices.
1 Introduction

The traditional scalar wave theory of optics (including aberrations to all
orders) is based on the beam-optical Hamiltonian derived using Fermat's
principle. This approach is purely geometrical and works adequately in the
scalar regime. The other approach is based on the Helmholtz equation, which
is derived from the Maxwell equations. In this approach one takes the
square root of the Helmholtz operator followed by an expansion of the
radical [1, 2]. This approach works to all orders and the resulting
expansion is no different from the one obtained using the geometrical
approach of Fermat's principle.

Another way of obtaining the aberration expansion is based on the
algebraic similarities between the Helmholtz equation and the Klein-Gordon
equation. Exploiting this algebraic similarity, the Helmholtz equation is
linearized in a procedure very similar to the one due to Feshbach and
Villars for linearizing the Klein-Gordon equation. This brings the
Helmholtz equation to a Dirac-like form, to which one applies the
Foldy-Wouthuysen expansion used in the Dirac electron theory. This
approach, which uses the algebraic machinery of quantum mechanics, was
developed recently [3], providing an alternative to the traditional
square-root procedure. This scalar formalism gives rise to
wavelength-dependent contributions modifying the aberration coefficients
[4]. The algebraic machinery of this formalism is very similar to the one
used in the quantum theory of charged-particle beam optics, based on the
Dirac [5]-[7] and the Klein-Gordon [8] equations respectively. A detailed
account of both is available in [9]. A treatment of beam optics taking
into account the anomalous magnetic moment is available in [10]-[13].

As for the polarization: a systematic procedure for the passage from
scalar to vector wave optics to handle paraxial beam propagation problems,
completely taking into account the way in which the Maxwell equations
couple the spatial variation and polarization of light waves, has been
formulated by analysing the basic Poincaré invariance of the system. This
procedure has been successfully used to clarify several issues in Maxwell
optics [14]-[17].

In all the above approaches, the beam-optics and the polarization are 
studied separately, using very different machineries. The derivation of the 
Helmholtz equation from the Maxwell equations is an approximation as one 
neglects the spatial and temporal derivatives of the permittivity and perme- 
ability of the medium. Any prescription based on the Helmholtz equation is 
bound to be an approximation, irrespective of how good it may be in cer- 
tain situations. It is very natural to look for a prescription based fully on 
the Maxwell equations. Such a prescription is sure to provide a deeper un- 
derstanding of beam-optics and polarization in a unified manner. With this 
as the chief motivation we construct a formalism starting with the Maxwell 
equations in a matrix form: a single entity containing all the four Maxwell 
equations. 

In our approach we require an exact matrix representation of the Maxwell 
equations in a medium taking into account the spatial and temporal varia- 
tions of the permittivity and permeability. It is necessary and sufficient to use 
8x8 matrices for such an exact representation. This representation makes 
use of the Riemann-Silberstein vector, which is described in Appendix-A. 
The derivation of the required matrix representation, and how it differs from 
the numerous other ones is presented in Appendix-B. 

The derived matrix representation of the Maxwell equations has a very
close algebraic correspondence with the Dirac equation. This enables us to
apply the machinery of the Foldy-Wouthuysen expansion used in the Dirac
electron theory. The Foldy-Wouthuysen transformation technique is outlined
in Appendix-C. General expressions for the Hamiltonians are derived without
assuming any specific form for the refractive index. These Hamiltonians are
shown to contain the extra wavelength-dependent contributions which arise
very naturally in our approach. In Section 4 we apply the general formalism
to two specific examples. The first, a medium with constant refractive
index (Section 4.1), essentially illustrates some of the details of the
machinery used.

The second application, the axially symmetric graded index medium
(Section 4.2), is used to demonstrate the power of the formalism. Two
points are worth mentioning. The first is the image rotation: our formalism
gives rise to an image rotation (proportional to the wavelength) and we
have derived an explicit relationship for the angle of the image rotation.
The second pertains to the aberrations: in our formalism we get all the
nine aberrations permitted by the axial symmetry. The traditional
approaches give six aberrations. Our formalism modifies these six
aberration coefficients by wavelength-dependent contributions and also
gives rise to the remaining three permitted by the axial symmetry. The
existence of the nine aberrations and the image rotation is well known in
axially symmetric magnetic lenses, even when treated classically. The
quantum treatment of the same system leads to the wavelength-dependent
modifications [9]. The alternate procedure for the Helmholtz optics in
[3, 4] gives the usual six aberrations (though modified by the
wavelength-dependent contributions) and does not give any image rotation.
These extra aberrations and the image rotation are the exclusive outcome
of the fact that the formalism is based on a treatment starting with an
exact matrix representation of the Maxwell equations.

The traditional beam-optics is completely obtained from our approach in
the limit of vanishing wavelength, $\bar{\lambda} \to 0$, which we call the
traditional limit of our formalism. This is analogous to the classical
limit obtained by taking $\hbar \to 0$ in the quantum prescriptions. The
scheme of using the Foldy-Wouthuysen machinery in this formalism is very
similar to the one used in the quantum theory of charged-particle beam
optics [5]-[13]. There too one recovers the classical prescriptions (Lie
algebraic formalism of charged-particle beam optics, to be precise) in the
limit $\bar{\lambda}_0 \to 0$, where $\bar{\lambda}_0 = \hbar / p_0$ is the
de Broglie wavelength and $p_0$ is the design momentum of the system under
study.

In this article we focus on the Hamiltonian description of the beam optics, 
as is customary in the traditional prescriptions of beam optics. This also
enables us to relate our formalism with the traditional prescriptions. The 
studies on the evolution of the fields and the polarization are very much in 
progress. Some of the results in [17] have been obtained as the lowest order 
approximation of the more general framework developed here. These shall 
be presented elsewhere [18]. 

2 Traditional Prescriptions 

Recall that in the traditional scalar wave theory for treating a
monochromatic quasiparaxial light beam propagating along the positive
z-axis, the z-evolution of the optical wave function $\psi(\mathbf{r})$ is
taken to obey the Schrödinger-like equation

$$ i\bar{\lambda}\frac{\partial}{\partial z}\psi(\mathbf{r}) = \hat{H}\psi(\mathbf{r}), \tag{1} $$

where the optical Hamiltonian $\hat{H}$ is formally given by the radical

$$ \hat{H} = -\left(n^2(\mathbf{r}) - \hat{p}_\perp^2\right)^{1/2}, \tag{2} $$

and $n(\mathbf{r}) = n(x, y, z)$ is the varying refractive index. In beam
optics the rays are assumed to propagate almost parallel to the optic axis,
chosen here to be the z-axis. That is, $|\hat{\mathbf{p}}_\perp| \ll 1$. The
refractive index is of the order of unity. For a medium with uniform
refractive index, $n(\mathbf{r}) = n_0$, and the Taylor expansion of the
radical is

$$ \hat{H} = -n_0\left\{1 - \frac{1}{n_0^2}\hat{p}_\perp^2\right\}^{1/2}
 = -n_0\left\{1 - \frac{1}{2n_0^2}\hat{p}_\perp^2 - \frac{1}{8n_0^4}\hat{p}_\perp^4
 - \frac{1}{16n_0^6}\hat{p}_\perp^6 - \cdots\right\}. \tag{3} $$

In the above expansion one retains terms to any desired degree of accuracy
in powers of $\left(\frac{1}{n_0^2}\hat{p}_\perp^2\right)$. In general the
refractive index is not a constant and varies. The variation of the
refractive index $n(\mathbf{r})$ is expressed as a Taylor expansion in the
spatial variables x, y with z-dependent coefficients. To get the beam
optical Hamiltonian one makes the expansion of the radical as before, and
retains terms to the desired order of accuracy in
$\left(\frac{1}{n_0^2}\hat{p}_\perp^2\right)$ along with all the other terms
(coming from the expansion of the refractive index $n(\mathbf{r})$) in the
phase-space components up to the same order. In this expansion procedure
the problem is partitioned into paraxial behaviour + aberrations,
order-by-order.
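
The convergence of the expansion of the radical is easy to check
numerically. The following is a minimal sketch, assuming a single ray whose
transverse momentum is treated as an ordinary number p rather than an
operator; the function names and the values of n0 and p are illustrative
assumptions.

```python
import math


def exact_hamiltonian(n0: float, p: float) -> float:
    """Exact radical H = -(n0^2 - p^2)^(1/2) for a uniform medium."""
    return -math.sqrt(n0 * n0 - p * p)


def series_hamiltonian(n0: float, p: float, terms: int) -> float:
    """Partial sum of the binomial series of -n0 (1 - x)^(1/2), x = p^2 / n0^2."""
    x = (p / n0) ** 2
    coeff = 1.0   # binomial coefficient C(1/2, k), starting at k = 0
    total = 0.0
    for k in range(terms):
        total += coeff * (-x) ** k
        coeff *= (0.5 - k) / (k + 1)   # C(1/2, k+1) obtained from C(1/2, k)
    return -n0 * total


n0, p = 1.5, 0.1   # assumed values with |p| << n0 (quasiparaxial regime)
errors = [abs(series_hamiltonian(n0, p, t) - exact_hamiltonian(n0, p))
          for t in (1, 2, 3, 4)]
```

Each additional term reduces the error by a factor of order p^2/n0^2, which
is why a few terms suffice for quasiparaxial beams.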

In relativistic quantum mechanics too, one has the problem of understand- 
ing the behaviour in terms of nonrelativistic limit + relativistic corrections, 
order-by-order. In the Dirac theory of the electron this is done most conve- 
niently through the Foldy-Wouthuysen transformation [19, 20]. The beam 
optical Hamiltonian derived, starting with the exact matrix representation of 
the Maxwell equations has a very close algebraic resemblance with the Dirac 
case, accompanied by the analogous physical interpretations. The details of 
this correspondence and the Foldy-Wouthuysen transformation are given in 
Appendix-C. 



3 The Beam-Optical Formalism 

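Matrix representations of the Maxwell equations are very well known
[21]-[23]. However, all these representations lack exactness and/or are
given in terms of a pair of matrix equations. A treatment expressing the
Maxwell equations in a single matrix equation instead of a pair of matrix
equations was obtained recently [24]-[26]. This representation contains all
the four Maxwell equations in the presence of sources, taking into account
the spatial and temporal variations of the permittivity
$\epsilon(\mathbf{r}, t)$ and the permeability $\mu(\mathbf{r}, t)$. The
Maxwell equations [27, 28] in an inhomogeneous medium with sources are

$$ \begin{aligned}
\nabla \cdot \mathbf{D}(\mathbf{r}, t) &= \rho, \\
\nabla \times \mathbf{H}(\mathbf{r}, t) - \frac{\partial}{\partial t}\mathbf{D}(\mathbf{r}, t) &= \mathbf{J}, \\
\nabla \times \mathbf{E}(\mathbf{r}, t) + \frac{\partial}{\partial t}\mathbf{B}(\mathbf{r}, t) &= 0, \\
\nabla \cdot \mathbf{B}(\mathbf{r}, t) &= 0.
\end{aligned} \tag{4} $$

We assume the media to be linear, that is
$\mathbf{D} = \epsilon(\mathbf{r}, t)\mathbf{E}$ and
$\mathbf{B} = \mu(\mathbf{r}, t)\mathbf{H}$, where $\epsilon$ is the
permittivity of the medium and $\mu$ is the permeability of the medium. The
magnitude of the velocity of light in the medium is given by
$v(\mathbf{r}, t) = |\mathbf{v}(\mathbf{r}, t)| = 1/\sqrt{\epsilon(\mathbf{r}, t)\mu(\mathbf{r}, t)}$.
In vacuum we have $\epsilon_0 = 8.85 \times 10^{-12}$ C²/N·m² and
$\mu_0 = 4\pi \times 10^{-7}$ N/A². Following the notation in [23, 24] we
use the Riemann-Silberstein vector given by

$$ \mathbf{F}^{\pm}(\mathbf{r}, t) = \frac{1}{\sqrt{2}}\left(\sqrt{\epsilon(\mathbf{r}, t)}\,\mathbf{E}(\mathbf{r}, t)
 \pm i\frac{1}{\sqrt{\mu(\mathbf{r}, t)}}\,\mathbf{B}(\mathbf{r}, t)\right). \tag{5} $$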



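We further define,

$$ \Psi^{\pm}(\mathbf{r}, t) = \begin{bmatrix} -F_x^{\pm} \pm iF_y^{\pm} \\ F_z^{\pm} \\ F_z^{\pm} \\ F_x^{\pm} \pm iF_y^{\pm} \end{bmatrix}, \qquad
W^{\pm} = \frac{1}{\sqrt{2\epsilon}}\begin{bmatrix} -J_x \pm iJ_y \\ J_z - v\rho \\ J_z + v\rho \\ J_x \pm iJ_y \end{bmatrix}, \tag{6} $$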



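where $W^{\pm}$ are the vectors for the sources. Following the notation in
[24] the exact matrix representation of the Maxwell equations is

$$ \begin{aligned}
&\frac{\partial}{\partial t}\begin{bmatrix} I & 0 \\ 0 & I \end{bmatrix}\begin{bmatrix} \Psi^{+} \\ \Psi^{-} \end{bmatrix}
 - \frac{\dot{v}(\mathbf{r}, t)}{2v(\mathbf{r}, t)}\begin{bmatrix} I & 0 \\ 0 & I \end{bmatrix}\begin{bmatrix} \Psi^{+} \\ \Psi^{-} \end{bmatrix}
 + \frac{\dot{h}(\mathbf{r}, t)}{2h(\mathbf{r}, t)}\begin{bmatrix} 0 & i\beta\alpha_y \\ i\beta\alpha_y & 0 \end{bmatrix}\begin{bmatrix} \Psi^{+} \\ \Psi^{-} \end{bmatrix} \\
&\quad = -v(\mathbf{r}, t)\begin{bmatrix} \{\mathbf{M}\cdot\nabla + \boldsymbol{\Sigma}\cdot\mathbf{u}\} & -i\beta(\boldsymbol{\Sigma}\cdot\mathbf{w})\alpha_y \\
 -i\beta(\boldsymbol{\Sigma}^{*}\cdot\mathbf{w})\alpha_y & \{\mathbf{M}^{*}\cdot\nabla + \boldsymbol{\Sigma}^{*}\cdot\mathbf{u}\} \end{bmatrix}\begin{bmatrix} \Psi^{+} \\ \Psi^{-} \end{bmatrix}
 - \begin{bmatrix} I & 0 \\ 0 & I \end{bmatrix}\begin{bmatrix} W^{+} \\ W^{-} \end{bmatrix},
\end{aligned} \tag{7} $$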



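where $(\ )^{*}$ denotes complex conjugation, $\dot{v} = \partial v/\partial t$
and $\dot{h} = \partial h/\partial t$. The various matrices are

$$ M_x = \begin{bmatrix} \mathbf{0} & \mathbf{1} \\ \mathbf{1} & \mathbf{0} \end{bmatrix}, \quad
M_y = \begin{bmatrix} \mathbf{0} & -i\mathbf{1} \\ i\mathbf{1} & \mathbf{0} \end{bmatrix}, \quad
M_z = \beta = \begin{bmatrix} \mathbf{1} & \mathbf{0} \\ \mathbf{0} & -\mathbf{1} \end{bmatrix}, $$

$$ \boldsymbol{\Sigma} = \begin{bmatrix} \boldsymbol{\sigma} & \mathbf{0} \\ \mathbf{0} & \boldsymbol{\sigma} \end{bmatrix}, \quad
\boldsymbol{\alpha} = \begin{bmatrix} \mathbf{0} & \boldsymbol{\sigma} \\ \boldsymbol{\sigma} & \mathbf{0} \end{bmatrix}, \quad
I = \begin{bmatrix} \mathbf{1} & \mathbf{0} \\ \mathbf{0} & \mathbf{1} \end{bmatrix}, \tag{8} $$

and $\mathbf{1}$ is the 2 × 2 unit matrix. The triplet of the Pauli
matrices, $\boldsymbol{\sigma}$, is

$$ \boldsymbol{\sigma} = \left[ \sigma_x = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}, \
\sigma_y = \begin{bmatrix} 0 & -i \\ i & 0 \end{bmatrix}, \
\sigma_z = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix} \right], \tag{9} $$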
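and

$$ \begin{aligned}
\mathbf{u}(\mathbf{r}, t) &= \frac{1}{2v(\mathbf{r}, t)}\nabla v(\mathbf{r}, t) = \frac{1}{2}\nabla\{\ln v(\mathbf{r}, t)\} = -\frac{1}{2}\nabla\{\ln n(\mathbf{r}, t)\}, \\
\mathbf{w}(\mathbf{r}, t) &= \frac{1}{2h(\mathbf{r}, t)}\nabla h(\mathbf{r}, t) = \frac{1}{2}\nabla\{\ln h(\mathbf{r}, t)\}.
\end{aligned} \tag{10} $$

Lastly,

$$ \begin{aligned}
\text{Velocity Function:} \quad v(\mathbf{r}, t) &= \frac{1}{\sqrt{\epsilon(\mathbf{r}, t)\mu(\mathbf{r}, t)}}, \\
\text{Resistance Function:} \quad h(\mathbf{r}, t) &= \sqrt{\frac{\mu(\mathbf{r}, t)}{\epsilon(\mathbf{r}, t)}}.
\end{aligned} \tag{11} $$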



As we shall soon see, it is advantageous to use the above derived functions
instead of the permittivity, $\epsilon(\mathbf{r}, t)$, and the
permeability, $\mu(\mathbf{r}, t)$. The functions $v(\mathbf{r}, t)$ and
$h(\mathbf{r}, t)$ have the dimensions of velocity and resistance
respectively.
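
As a quick numerical illustration of these two functions, the sketch below
evaluates them for vacuum using the values of the vacuum permittivity and
permeability quoted earlier. The function and variable names are
illustrative assumptions, not notation from the formalism.

```python
import math

EPSILON_0 = 8.85e-12          # vacuum permittivity in C^2/(N m^2), as quoted in the text
MU_0 = 4.0 * math.pi * 1e-7   # vacuum permeability in N/A^2


def velocity_function(eps: float, mu: float) -> float:
    """v = 1 / sqrt(eps * mu), with the dimension of velocity."""
    return 1.0 / math.sqrt(eps * mu)


def resistance_function(eps: float, mu: float) -> float:
    """h = sqrt(mu / eps), with the dimension of resistance."""
    return math.sqrt(mu / eps)


v_vacuum = velocity_function(EPSILON_0, MU_0)     # metres per second
h_vacuum = resistance_function(EPSILON_0, MU_0)   # ohms
```

For vacuum this gives the speed of light and a resistance of about 377 ohms,
the familiar impedance of free space.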

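Let us consider the case without any sources ($W^{\pm} = 0$). We further
assume,

$$ \Psi^{\pm}(\mathbf{r}, t) = \psi^{\pm}(\mathbf{r})\,e^{-i\omega t}, \qquad \omega > 0, \tag{12} $$

with $\dot{v}(\mathbf{r}, t) = 0$ and $\dot{h}(\mathbf{r}, t) = 0$. Then,

$$ \begin{bmatrix} M_z & 0 \\ 0 & M_z \end{bmatrix}\frac{\partial}{\partial z}\begin{bmatrix} \psi^{+} \\ \psi^{-} \end{bmatrix}
 = i\frac{\omega}{v(\mathbf{r})}\begin{bmatrix} \psi^{+} \\ \psi^{-} \end{bmatrix}
 - \begin{bmatrix} \{\mathbf{M}_\perp\cdot\nabla_\perp + \boldsymbol{\Sigma}\cdot\mathbf{u}\} & -i\beta(\boldsymbol{\Sigma}\cdot\mathbf{w})\alpha_y \\
 -i\beta(\boldsymbol{\Sigma}^{*}\cdot\mathbf{w})\alpha_y & \{\mathbf{M}^{*}_\perp\cdot\nabla_\perp + \boldsymbol{\Sigma}^{*}\cdot\mathbf{u}\} \end{bmatrix}\begin{bmatrix} \psi^{+} \\ \psi^{-} \end{bmatrix}. \tag{13} $$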



At this stage we introduce the process of wavization, through the familiar
Schrödinger replacement

$$ -i\bar{\lambda}\nabla_\perp \longrightarrow \hat{\mathbf{p}}_\perp, \qquad
 -i\bar{\lambda}\frac{\partial}{\partial z} \longrightarrow p_z, \tag{14} $$

where $\bar{\lambda} = \lambda/2\pi$ is the reduced wavelength,
$c = \bar{\lambda}\omega$ and $n(\mathbf{r}) = c/v(\mathbf{r})$ is the
refractive index of the medium. Note that
$(\hat{p}q - q\hat{p}) = -i\bar{\lambda}$, which is very similar to the
commutation relation $(\hat{p}q - q\hat{p}) = -i\hbar$ in quantum mechanics.
In our formalism, $\bar{\lambda}$ plays the same role which is played by the
Planck constant $\hbar$ in quantum mechanics. The traditional beam-optics is
completely obtained from our formalism in the limit $\bar{\lambda} \to 0$.

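Noting that $M_z^{-1} = M_z = \beta$, we multiply both sides of equation
(13) by

$$ \begin{bmatrix} M_z & 0 \\ 0 & M_z \end{bmatrix}^{-1} = \begin{bmatrix} \beta & 0 \\ 0 & \beta \end{bmatrix} \tag{15} $$

and $(i\bar{\lambda})$; then we obtain

$$ i\bar{\lambda}\frac{\partial}{\partial z}\begin{bmatrix} \psi^{+}(\mathbf{r}_\perp, z) \\ \psi^{-}(\mathbf{r}_\perp, z) \end{bmatrix}
 = \hat{H}_g\begin{bmatrix} \psi^{+}(\mathbf{r}_\perp, z) \\ \psi^{-}(\mathbf{r}_\perp, z) \end{bmatrix}. \tag{16} $$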



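This is the basic optical equation, where

$$ \begin{aligned}
\hat{H}_g &= -n_0\begin{bmatrix} \beta & 0 \\ 0 & -\beta \end{bmatrix}\beta_g + \hat{\mathcal{E}}_g + \hat{\mathcal{O}}_g, \\
\hat{\mathcal{E}}_g &= -(n(\mathbf{r}) - n_0)\begin{bmatrix} \beta & 0 \\ 0 & -\beta \end{bmatrix}\beta_g
 + \begin{bmatrix} \beta\{\mathbf{M}_\perp\cdot\hat{\mathbf{p}}_\perp - i\bar{\lambda}\boldsymbol{\Sigma}\cdot\mathbf{u}\} & 0 \\
 0 & \beta\{\mathbf{M}^{*}_\perp\cdot\hat{\mathbf{p}}_\perp - i\bar{\lambda}\boldsymbol{\Sigma}^{*}\cdot\mathbf{u}\} \end{bmatrix}, \\
\hat{\mathcal{O}}_g &= \begin{bmatrix} 0 & -\bar{\lambda}(\boldsymbol{\Sigma}\cdot\mathbf{w})\alpha_y \\
 -\bar{\lambda}(\boldsymbol{\Sigma}^{*}\cdot\mathbf{w})\alpha_y & 0 \end{bmatrix},
\end{aligned} \tag{17} $$

where 'g' stands for grand, signifying the eight dimensions, and

$$ \beta_g = \begin{bmatrix} I & 0 \\ 0 & -I \end{bmatrix}. \tag{18} $$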



The above optical Hamiltonian is exact (as exact as the Maxwell equations
in a time-independent linear medium). The approximations are made only at
the time of doing specific calculations. Apart from the exactness, the
optical Hamiltonian is in complete algebraic correspondence with the Dirac
equation, with appropriate physical interpretations. The relevant point is:

$$ \beta_g\hat{\mathcal{E}}_g = \hat{\mathcal{E}}_g\beta_g, \qquad
 \beta_g\hat{\mathcal{O}}_g = -\hat{\mathcal{O}}_g\beta_g. \tag{19} $$
We note that the upper component ($\psi^{+}$) is coupled to the lower
component ($\psi^{-}$) through the logarithmic gradient of the resistance
function. If this coupling function, $\mathbf{w} = 0$, or is approximated to
be zero, then the eight-dimensional equations for $\psi^{+}$ and $\psi^{-}$
get completely decoupled, leading to two independent four-dimensional
equations. Each of these two equations is equivalent to the other. These
are the leading equations for our studies of beam-optics and polarization.
In the optics context any contribution from the gradient of the resistance
function can be assumed to be negligible. With this reasonable assumption
we can decouple the equations and reduce the problem from eight dimensions
to four dimensions. In the following sections we shall present a formalism
with the approximation $\mathbf{w} \approx 0$. After constructing the
formalism in four dimensions we shall also address the question of dealing
with the contributions coming from the gradient of the resistance function.
This will require the application of the Foldy-Wouthuysen transformation
technique in cascade, as we shall see. This justifies the usage of the two
derived laboratory functions in place of the permittivity and permeability
respectively. We drop the '+' throughout; then the beam-optical Hamiltonian
is

$$ \begin{aligned}
i\bar{\lambda}\frac{\partial}{\partial z}\psi(\mathbf{r}) &= \hat{H}\psi(\mathbf{r}), \\
\hat{H} &= -n_0\beta + \hat{\mathcal{E}} + \hat{\mathcal{O}}, \\
\hat{\mathcal{E}} &= -(n(\mathbf{r}) - n_0)\beta - i\bar{\lambda}\beta\boldsymbol{\Sigma}\cdot\mathbf{u}, \\
\hat{\mathcal{O}} &= i(M_y\hat{p}_x - M_x\hat{p}_y).
\end{aligned} \tag{20} $$

If we were to neglect the derivatives of the permittivity and permeability,
we would have missed the term $(-i\bar{\lambda}\beta\boldsymbol{\Sigma}\cdot\mathbf{u})$.
This is an outcome of the exact treatment.

Proceeding with our analogy with the Dirac equation: this extra term is
analogous to the anomalous magnetic/electric moment term coupled to the
magnetic/electric field respectively in the Dirac equation. The term we
dropped (while going from the eight-dimensional exact to the
four-dimensional almost-exact) is analogous to the anomalous
magnetic/electric moment term coupled to the electric/magnetic fields
respectively. However, it should be borne in mind that in our exact
treatment both the terms were derived from the Maxwell equations, whereas
in the Dirac theory the anomalous terms are added based on experimental
results (some even predating the Dirac equation) and certain arguments of
invariances. In our exact treatment of the Maxwell optics, these are the
only two terms one gets, whereas in the Dirac equation the scheme of
invariances permits addition of any number of terms! The term
$(-i\bar{\lambda}\beta\boldsymbol{\Sigma}\cdot\mathbf{u})$ is related to the
polarization and we shall call it the polarization term.

One of the other similarities worth noting relates to the square of the
optical Hamiltonian,

$$ \begin{aligned}
\hat{H}^2 &= \{n^2(\mathbf{r}) - \hat{p}_\perp^2\} - \bar{\lambda}^2u^2 + [\mathbf{M}_\perp\cdot\hat{\mathbf{p}}_\perp, n(\mathbf{r})]
 + 2i\bar{\lambda}n(\mathbf{r})\boldsymbol{\Sigma}\cdot\mathbf{u} + i\bar{\lambda}[\mathbf{M}_\perp\cdot\hat{\mathbf{p}}_\perp, \boldsymbol{\Sigma}\cdot\mathbf{u}] \\
 &= \{n(\mathbf{r}) + i\bar{\lambda}\boldsymbol{\Sigma}\cdot\mathbf{u}\}^2 - \hat{p}_\perp^2
 + [\mathbf{M}_\perp\cdot\hat{\mathbf{p}}_\perp, \{n(\mathbf{r}) + i\bar{\lambda}\boldsymbol{\Sigma}\cdot\mathbf{u}\}],
\end{aligned} \tag{21} $$

where $[A, B] = (AB - BA)$ is the commutator. It is to be noted that the
square of the Hamiltonian in our formalism differs from the square of the
Hamiltonian in the square-root approaches [1, 2] and the scalar approach
in [3, 4]. This is essentially the same type of difference which exists in
the Dirac case. There too, the square of the Dirac Hamiltonian gives rise
to extra pieces (such as $-\hbar q\boldsymbol{\Sigma}\cdot\mathbf{B}$, the
Pauli term which couples the spin to the magnetic field) which are absent
in the Schrödinger and the Klein-Gordon descriptions. It is this difference
in the square of the Hamiltonians which gives rise to the various extra
wavelength-dependent contributions in our formalism. These differences
persist even in the approximation when the polarization term is neglected.
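
The agreement between the two forms of the squared Hamiltonian rests on one
short step, sketched below. It assumes only that the components of
$\boldsymbol{\Sigma}$ obey the Pauli anticommutation rule and that
$n(\mathbf{r})$ and the components of $\mathbf{u}$ are ordinary functions of
position, which therefore commute with one another.

```latex
\begin{align*}
(\boldsymbol{\Sigma}\cdot\mathbf{u})^2
  &= \sum_{j,k}\Sigma_j\Sigma_k\,u_j u_k
   = \tfrac{1}{2}\sum_{j,k}\{\Sigma_j,\Sigma_k\}\,u_j u_k
   = \sum_{j,k}\delta_{jk}\,u_j u_k = u^2 , \\
\{n + i\bar{\lambda}\,\boldsymbol{\Sigma}\cdot\mathbf{u}\}^2
  &= n^2 + 2i\bar{\lambda}\,n\,\boldsymbol{\Sigma}\cdot\mathbf{u} - \bar{\lambda}^2 u^2 .
\end{align*}
```

The cross term and the $-\bar{\lambda}^2u^2$ term are exactly the pieces that
have no counterpart in the square of the scalar (Helmholtz) Hamiltonian.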

The beam optical Hamiltonian derived in (20) has a very close algebraic
correspondence with the Dirac equation, accompanied by the analogous
physical interpretations. This enables us to employ the machinery of the
Foldy-Wouthuysen transformation technique. The details are available in
Appendix-C. To the leading order, that is to order
$\left(\frac{1}{n_0^2}\hat{p}_\perp^2\right)$, the beam-optical Hamiltonian
in terms of $\hat{\mathcal{E}}$ and $\hat{\mathcal{O}}$ is formally given by

$$ \hat{H}^{(2)} = -n_0\beta + \hat{\mathcal{E}} - \frac{1}{2n_0}\beta\hat{\mathcal{O}}^2. \tag{22} $$

Note that $\hat{\mathcal{O}}^2 = -\hat{p}_\perp^2$ and
$\hat{\mathcal{E}} = -(n(\mathbf{r}) - n_0)\beta - i\bar{\lambda}\beta\boldsymbol{\Sigma}\cdot\mathbf{u}$.
Since we are primarily interested in the forward propagation, we drop the
$\beta$ from the non-matrix parts of the Hamiltonian. The matrix terms are
related to the polarization. The formal Hamiltonian in (22), expressed in
terms of the phase-space variables, is

$$ \hat{H}^{(2)} = -\left\{n(\mathbf{r}) - \frac{1}{2n_0}\hat{p}_\perp^2\right\}
 - i\bar{\lambda}\beta\boldsymbol{\Sigma}\cdot\mathbf{u}. \tag{23} $$

Note that one retains terms up to quadratic in the Taylor expansion of the
refractive index $n(\mathbf{r})$, to be consistent with the order of
$\left(\frac{1}{n_0^2}\hat{p}_\perp^2\right)$. This is the paraxial
Hamiltonian, which also contains an extra matrix-dependent term, which we
call the polarization term. The rest of it is similar to the one obtained
in the traditional approaches.
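
The matrix identities used in this step can be checked numerically. The
sketch below is an illustration only: it assumes the momentum
representation, where the transverse momenta act as ordinary numbers, and
uses hypothetical helper names and sample values. It builds the odd
operator from the 4 × 4 matrices $M_x$, $M_y$ and $\beta$ and confirms that
its square equals $-p_\perp^2$ times the unit matrix.

```python
def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def lincomb(ca, a, cb, b):
    """Entrywise linear combination ca*a + cb*b."""
    return [[ca * x + cb * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def close(a, b, tol=1e-12):
    """True when two matrices agree entrywise within tol."""
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))


IDENTITY = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
ZERO = [[0] * 4 for _ in range(4)]
# M_x, M_y and beta = M_z written out from their 2x2 blocks.
M_X = [[0, 0, 1, 0], [0, 0, 0, 1], [1, 0, 0, 0], [0, 1, 0, 0]]
M_Y = [[0, 0, -1j, 0], [0, 0, 0, -1j], [1j, 0, 0, 0], [0, 1j, 0, 0]]
BETA = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]

px, py = 0.03, -0.07   # sample transverse momenta (assumed values)

odd = matmul(BETA, lincomb(px, M_X, py, M_Y))    # beta (M_perp . p_perp)
odd_alt = lincomb(1j * px, M_Y, -1j * py, M_X)   # i (M_y p_x - M_x p_y)
odd_squared = matmul(odd, odd)
minus_p2 = lincomb(-(px * px + py * py), IDENTITY, 0, ZERO)
```

Both forms of the odd operator coincide, and its square is the scalar
$-p_\perp^2$, which is what turns the first Foldy-Wouthuysen correction into
the familiar paraxial term.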

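To go beyond the paraxial approximation one goes a step further in the
Foldy-Wouthuysen iterative procedure. Note that $\hat{\mathcal{O}}$ is of
the order of $\hat{p}_\perp$. To order
$\left(\frac{1}{n_0^2}\hat{p}_\perp^2\right)^2$, the beam-optical
Hamiltonian in terms of $\hat{\mathcal{E}}$ and $\hat{\mathcal{O}}$ is
formally given by

$$ \begin{aligned}
\hat{H}^{(4)} = {}& -n_0\beta + \hat{\mathcal{E}} - \frac{1}{2n_0}\beta\hat{\mathcal{O}}^2
 - \frac{1}{8n_0^2}\left[\hat{\mathcal{O}}, \left([\hat{\mathcal{O}}, \hat{\mathcal{E}}] + i\bar{\lambda}\frac{\partial}{\partial z}\hat{\mathcal{O}}\right)\right] \\
 & + \frac{1}{8n_0^3}\beta\left\{\hat{\mathcal{O}}^4 + \left([\hat{\mathcal{O}}, \hat{\mathcal{E}}] + i\bar{\lambda}\frac{\partial}{\partial z}\hat{\mathcal{O}}\right)^2\right\}.
\end{aligned} \tag{24} $$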



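Note that $\hat{\mathcal{O}}^4 = \hat{p}_\perp^4$ and
$\frac{\partial}{\partial z}\hat{\mathcal{O}} = 0$. The formal Hamiltonian
in (24), when expressed in terms of the phase-space variables, is

$$ \begin{aligned}
\hat{H}^{(4)} = {}& -\left\{n(\mathbf{r}) - \frac{1}{2n_0}\hat{p}_\perp^2 - \frac{1}{8n_0^3}\hat{p}_\perp^4\right\} \\
 & - \frac{1}{8n_0^2}\left\{\left[\hat{p}_\perp^2, (n(\mathbf{r}) - n_0)\right]_+
 + 2\left(\hat{p}_x(n(\mathbf{r}) - n_0)\hat{p}_x + \hat{p}_y(n(\mathbf{r}) - n_0)\hat{p}_y\right)\right\} \\
 & - \frac{i}{8n_0^2}\left\{\left[\hat{p}_x, [\hat{p}_y, (n(\mathbf{r}) - n_0)]_+\right]
 - \left[\hat{p}_y, [\hat{p}_x, (n(\mathbf{r}) - n_0)]_+\right]\right\} \\
 & + \frac{1}{8n_0^3}\left\{[\hat{p}_x, (n(\mathbf{r}) - n_0)]_+^2 + [\hat{p}_y, (n(\mathbf{r}) - n_0)]_+^2\right\} \\
 & + \frac{i}{8n_0^3}\left\{\left[[\hat{p}_x, (n(\mathbf{r}) - n_0)]_+, [\hat{p}_y, (n(\mathbf{r}) - n_0)]_+\right]\right\} \\
 & + \cdots
\end{aligned} \tag{25} $$

where $[A, B]_+ = (AB + BA)$ and '···' are the contributions arising from
the presence of the polarization term. Any further simplification would
require information about the refractive index $n(\mathbf{r})$.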
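
The passage from the formal commutator expression to phase-space variables
can be followed through a few intermediate identities. The sketch below is
a worked reconstruction, not a quotation: it drops the polarization term,
writes $\delta = n(\mathbf{r}) - n_0$ and $A_j = [\hat{p}_j, \delta]_+$
(shorthand introduced here), and uses only $M_x^2 = M_y^2 = 1$,
$M_xM_y = -M_yM_x = i\beta$ and $\beta M_j = -M_j\beta$ for $j = x, y$.

```latex
% With E = -\delta\,\beta and O = \beta\,(M_\perp \cdot p_\perp) = i (M_y p_x - M_x p_y):
\begin{align*}
[\hat{\mathcal{O}}, \hat{\mathcal{E}}]
  &= M_x A_x + M_y A_y , \\
[\hat{\mathcal{O}}, \hat{\mathcal{E}}]^2
  &= A_x^2 + A_y^2 + i\beta\,[A_x, A_y] , \\
\bigl[\hat{\mathcal{O}}, [\hat{\mathcal{O}}, \hat{\mathcal{E}}]\bigr]
  &= \beta\bigl([\hat{p}_x, A_x]_+ + [\hat{p}_y, A_y]_+\bigr)
   + i\bigl([\hat{p}_x, A_y] - [\hat{p}_y, A_x]\bigr) , \\
[\hat{p}_x, A_x]_+ + [\hat{p}_y, A_y]_+
  &= [\hat{p}_\perp^2, \delta]_+ + 2\,(\hat{p}_x\,\delta\,\hat{p}_x + \hat{p}_y\,\delta\,\hat{p}_y) .
\end{align*}
```

Inserting these into the formal fourth-order Hamiltonian and dropping
$\beta$ from the non-matrix parts gives the anticommutator terms, the nested
commutator terms and the squared anticommutator terms of the phase-space
expression.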

Note that the paraxial Hamiltonian (23) and the leading order aberration
Hamiltonian (25) differ from the ones derived in the traditional
approaches. These differences arise from the presence of the
wavelength-dependent contributions, which occur in two guises. One set
occurs totally independent of the polarization term in the basic
Hamiltonian. This set is a multiple of the unit matrix or at most the
matrix $\beta$. The other set involves the contributions coming from the
polarization term in the starting optical Hamiltonian. This gives rise to
both matrix contributions and non-matrix contributions, as the square of
each polarization matrix is unity. We shall discuss the contributions of
the polarization to the beam optics elsewhere. Here, it suffices to note
the existence of the wavelength-dependent contributions in two
distinguishable guises, which are not present in the traditional
prescriptions.



3.1 When w ≠ 0

In the previous sections we assumed, w = 0, and this enabled us to develop 
a formalism using 4x4 matrices via the Foldy-Wouthuysen machinery. The 
Foldy-Wouthuysen transformation enables us to eliminate the odd part in 
the 4x4 matrices, to any desired order of accuracy. Here too we have the 
identical problem, but a step higher in dimensions. So, we need to apply the 
Foldy-Wouthuysen to reduce the strength of the odd part in eight dimensions. 
This will reduce the problem from eight to four dimensions. 

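We start with the grand beam optical equation in (16) and proceed with the
Foldy-Wouthuysen transformations as before, but with each quantity in
double the number of dimensions. Symbolically this means:

$$ \hat{H} \to \hat{H}_g, \quad
 \psi \to \psi_g = \begin{bmatrix} \psi^{+} \\ \psi^{-} \end{bmatrix}, \quad
 \hat{\mathcal{E}} \to \hat{\mathcal{E}}_g, \quad
 \hat{\mathcal{O}} \to \hat{\mathcal{O}}_g, \quad
 n_0 \to n_g = n_0\begin{bmatrix} \beta & 0 \\ 0 & -\beta \end{bmatrix}. \tag{26} $$

The first Foldy-Wouthuysen iteration gives

$$ \begin{aligned}
\hat{H}_g^{(2)} &= -n_0\begin{bmatrix} \beta & 0 \\ 0 & -\beta \end{bmatrix}\beta_g + \hat{\mathcal{E}}_g
 - \frac{1}{2n_g}\beta_g\hat{\mathcal{O}}_g^2 \\
 &= -n_0\begin{bmatrix} \beta & 0 \\ 0 & -\beta \end{bmatrix}\beta_g + \hat{\mathcal{E}}_g
 + \frac{1}{2n_0}\bar{\lambda}^2\,\mathbf{w}\cdot\mathbf{w}\begin{bmatrix} \beta & 0 \\ 0 & -\beta \end{bmatrix}\beta_g.
\end{aligned} \tag{27} $$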



We drop the $\beta_g$, as before, and then get the following

$$ \begin{aligned}
i\bar{\lambda}\frac{\partial}{\partial z}\psi(\mathbf{r}) &= \hat{H}\psi(\mathbf{r}), \\
\hat{H} &= -n_0\beta + \hat{\mathcal{E}} + \hat{\mathcal{O}}, \\
\hat{\mathcal{E}} &= -(n(\mathbf{r}) - n_0)\beta - i\bar{\lambda}\beta\boldsymbol{\Sigma}\cdot\mathbf{u}
 + \frac{1}{2n_0}\bar{\lambda}^2w^2\beta, \\
\hat{\mathcal{O}} &= i(M_y\hat{p}_x - M_x\hat{p}_y) = \beta(\mathbf{M}_\perp\cdot\hat{\mathbf{p}}_\perp),
\end{aligned} \tag{28} $$

where $w^2 = \mathbf{w}\cdot\mathbf{w}$ is the square of the logarithmic
gradient of the resistance function. This is how the basic beam optical
Hamiltonian (20) gets modified. The next degree of accuracy is achieved by
going a step further in the Foldy-Wouthuysen iteration and obtaining
$\hat{H}_g^{(4)}$. This would then be a more refined starting beam optical
Hamiltonian, further modifying the basic beam optical Hamiltonian (20). In
this way, we can apply the Foldy-Wouthuysen transformation in cascade to
obtain the higher order contributions coming from the logarithmic gradient
of the resistance function, to any desired degree of accuracy. We are very
unlikely to need any of these contributions, but it is very much possible
to keep track of them.



4 Applications 

In the previous sections we presented the exact matrix representation of
the Maxwell equations in a medium with varying permittivity and
permeability following the recipe in [24]. From this we derived an exact
optical Hamiltonian, which was shown to be in close algebraic analogy with
the Dirac equation. This enabled us to apply the machinery of the
Foldy-Wouthuysen transformation and we obtained an expansion for the
beam-optical Hamiltonian which works to all orders. Formal expressions were
obtained for the paraxial Hamiltonian and the leading order aberrating
Hamiltonian, without assuming any form for the refractive index. Even at
the paraxial level the wavelength-dependent effects manifest themselves
through the presence of a matrix term coupled to the logarithmic gradient
of the refractive index. This matrix term is very similar to the spin term
in the Dirac equation and we call it the polarizing term in our formalism.
The aberrating Hamiltonian contains numerous wavelength-dependent terms in
two guises. One of these is the explicit wavelength-dependent terms coming
from the commutators inbuilt in the formalism, with $\bar{\lambda}$ playing
the role played by $\hbar$ in quantum mechanics. The other set arises from
the polarizing term.

Now, we apply the formalism to specific examples. One is the medium 
with constant refractive index. This is perhaps the only problem which can 
be solved exactly in a closed form expression. This is just to illustrate how 
the aberration expansion in our formalism can be summed to give the familiar 
exact result. 

The next example is that of the axially symmetric graded index medium. 
This example enables us to demonstrate the power of the formalism, repro- 
ducing the familiar results from the traditional approaches and further giving 
rise to new results, dependent on the wavelength. 

4.1 Medium with Constant Refractive Index 

Constant refractive index is the simplest possible system. In our formalism, 
this is perhaps the only case where it is possible to do an exact diagonal- 
ization. This is very similar to the exact diagonalization of the free Dirac 
Hamiltonian. From the experience of the Dirac theory we know that there 
are hardly any situations where one can do the exact diagonalization. One 
necessarily has to resort to some approximate diagonalization procedure. 
The Foldy-Wouthuysen transformation scheme provides the most convenient 
and accurate diagonalization to any desired degree of accuracy. So we have 
adopted the Foldy-Wouthuysen scheme in our formalism. 
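For a medium with constant refractive index, $n(\mathbf{r}) = n_c$, we have

$$ \hat{H}_c = -n_c\beta + i(M_y\hat{p}_x - M_x\hat{p}_y), \tag{29} $$

which is exactly diagonalized by the following transform,

$$ \begin{aligned}
T^{\pm} &= \exp\left[i(\pm i\beta)\hat{\mathcal{O}}\theta\right]
 = \exp\left[\mp i\beta(M_y\hat{p}_x - M_x\hat{p}_y)\theta\right] \\
 &= \cosh(|\hat{\mathbf{p}}_\perp|\theta) \mp \frac{i\beta(M_y\hat{p}_x - M_x\hat{p}_y)}{|\hat{\mathbf{p}}_\perp|}\sinh(|\hat{\mathbf{p}}_\perp|\theta).
\end{aligned} \tag{30} $$

We choose

$$ \tanh(2|\hat{\mathbf{p}}_\perp|\theta) = \frac{|\hat{\mathbf{p}}_\perp|}{n_c}, \tag{31} $$

then

$$ T^{\pm} = \frac{(n_c + P_z) \mp i\beta(M_y\hat{p}_x - M_x\hat{p}_y)}{\sqrt{2P_z(n_c + P_z)}}, \tag{32} $$

where $P_z = +\sqrt{n_c^2 - \hat{p}_\perp^2}$. Then we obtain

$$ \hat{H}_c^{\rm diagonal} = T^{+}\hat{H}_cT^{-}
 = T^{+}\left\{-n_c\beta + i(M_y\hat{p}_x - M_x\hat{p}_y)\right\}T^{-}
 = -\left\{n_c^2 - \hat{p}_\perp^2\right\}^{1/2}\beta. \tag{33} $$

We next compare the exact result thus obtained with the approximate one
obtained through the systematic series procedure we have developed,

$$ \begin{aligned}
\hat{H}_c^{(4)} &= -n_c\left\{1 - \frac{1}{2n_c^2}\hat{p}_\perp^2 - \frac{1}{8n_c^4}\hat{p}_\perp^4 - \cdots\right\}\beta \\
 &\approx -n_c\left\{1 - \frac{1}{n_c^2}\hat{p}_\perp^2\right\}^{1/2}\beta
 = -\left\{n_c^2 - \hat{p}_\perp^2\right\}^{1/2}\beta = \hat{H}_c^{\rm diagonal}.
\end{aligned} \tag{34} $$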
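
The exact diagonalization can be verified numerically for a single plane
wave. The following is a minimal sketch under stated assumptions: the
momentum representation is used so that the transverse momenta are ordinary
numbers, the values of n_c, px and py are illustrative, and the helper names
are hypothetical. It applies the closed-form transform to the
constant-index Hamiltonian and checks that the result is
$-\sqrt{n_c^2 - p_\perp^2}\,\beta$.

```python
import math


def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def lincomb(ca, a, cb, b):
    """Entrywise linear combination ca*a + cb*b."""
    return [[ca * x + cb * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def close(a, b, tol=1e-12):
    """True when two matrices agree entrywise within tol."""
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))


IDENTITY = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
M_X = [[0, 0, 1, 0], [0, 0, 0, 1], [1, 0, 0, 0], [0, 1, 0, 0]]
M_Y = [[0, 0, -1j, 0], [0, 0, 0, -1j], [1j, 0, 0, 0], [0, 1j, 0, 0]]
BETA = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]

n_c, px, py = 1.5, 0.3, -0.2   # assumed sample values with p_perp^2 < n_c^2

m_dot_p = lincomb(px, M_X, py, M_Y)                 # M_perp . p_perp = i beta (M_y p_x - M_x p_y)
h_c = lincomb(-n_c, BETA, 1.0, matmul(BETA, m_dot_p))  # -n_c beta + beta (M_perp . p_perp)

p_z = math.sqrt(n_c ** 2 - px ** 2 - py ** 2)
norm = math.sqrt(2.0 * p_z * (n_c + p_z))
t_plus = lincomb((n_c + p_z) / norm, IDENTITY, -1.0 / norm, m_dot_p)
t_minus = lincomb((n_c + p_z) / norm, IDENTITY, +1.0 / norm, m_dot_p)

h_diagonal = matmul(matmul(t_plus, h_c), t_minus)
expected = lincomb(-p_z, BETA, 0.0, IDENTITY)       # -P_z beta
```

The product of the two transforms is the unit matrix, so the transformation
is invertible, and the transformed Hamiltonian contains no odd part.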
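Knowing the Hamiltonian, we can compute the transfer maps. The transfer
operator between any pair of points $\{(z'', z') \,|\, z'' > z'\}$ on the
z-axis is formally given by

$$ |\psi(z'')\rangle = \hat{\mathcal{T}}(z'', z')\,|\psi(z')\rangle, $$

with

$$ i\bar{\lambda}\frac{\partial}{\partial z''}\hat{\mathcal{T}}(z'', z') = \hat{H}\hat{\mathcal{T}}(z'', z'), \qquad
 \hat{\mathcal{T}}(z', z') = \hat{I}, \tag{35} $$

$$ \begin{aligned}
\hat{\mathcal{T}}(z'', z') &= \wp\left\{\exp\left[-\frac{i}{\bar{\lambda}}\int_{z'}^{z''}dz\,\hat{H}(z)\right]\right\} \\
 &= \hat{I} - \frac{i}{\bar{\lambda}}\int_{z'}^{z''}dz\,\hat{H}(z)
 + \left(-\frac{i}{\bar{\lambda}}\right)^2\int_{z'}^{z''}dz\int_{z'}^{z}dz'\,\hat{H}(z)\hat{H}(z') + \cdots,
\end{aligned} \tag{36} $$

where $\hat{I}$ is the identity operator and $\wp$ denotes the path-ordered
exponential. There is no closed form expression for
$\hat{\mathcal{T}}(z'', z')$ for an arbitrary choice of the refractive index
$n(\mathbf{r})$. In such a situation the most convenient form of the
expression for the z-evolution operator $\hat{\mathcal{T}}(z'', z')$, or the
z-propagator, is

$$ \hat{\mathcal{T}}(z'', z') = \exp\left[-\frac{i}{\bar{\lambda}}\hat{T}(z'', z')\right], \tag{37} $$

with

$$ \hat{T}(z'', z') = \int_{z'}^{z''}dz\,\hat{H}(z)
 + \frac{1}{2}\left(-\frac{i}{\bar{\lambda}}\right)\int_{z'}^{z''}dz\int_{z'}^{z}dz'\left[\hat{H}(z), \hat{H}(z')\right]
 + \cdots, \tag{38} $$

as given by the Magnus formula [29], which is described in Appendix-D. We
shall be needing these expressions in the next example where the refractive
index is not a constant.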
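Using the procedure outlined above we compute the transfer operator,

$$ \hat{U}_c(z_{\rm out}, z_{\rm in}) = \exp\left[-\frac{i}{\bar{\lambda}}\Delta z\,\hat{H}_c\right]
 = \exp\left[+\frac{i}{\bar{\lambda}}n_c\Delta z\left\{1 - \frac{1}{2}\frac{\hat{p}_\perp^2}{n_c^2}
 - \frac{1}{8}\left(\frac{\hat{p}_\perp^2}{n_c^2}\right)^2 - \cdots\right\}\right], \tag{39} $$

where $\Delta z = (z_{\rm out} - z_{\rm in})$. Using (39), we compute the
transfer maps

$$ \begin{pmatrix} \langle\mathbf{r}_\perp\rangle \\ \langle\mathbf{p}_\perp\rangle \end{pmatrix}_{\rm out}
 = \begin{pmatrix} 1 & \dfrac{1}{\sqrt{n_c^2 - \hat{p}_\perp^2}}\Delta z \\ 0 & 1 \end{pmatrix}
 \begin{pmatrix} \langle\mathbf{r}_\perp\rangle \\ \langle\mathbf{p}_\perp\rangle \end{pmatrix}_{\rm in}. \tag{40} $$

The beam-optical Hamiltonian is intrinsically aberrating. Even for the
simplest situation of a constant refractive index, we have aberrations to
all orders.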
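
The statement that even free propagation is aberrating can be made concrete
by comparing the exact transfer map with its paraxial approximation. The
sketch below is illustrative only: it treats a single ray with a sharp
transverse momentum (so that $\hat{p}_\perp$ is an ordinary number), and the
function names and sample values are assumptions.

```python
import math


def exact_drift(r_in, p_in, n_c, dz):
    """Exact map: p is conserved, r advances by dz * p / sqrt(n_c^2 - p^2)."""
    p2 = p_in[0] ** 2 + p_in[1] ** 2
    slope = dz / math.sqrt(n_c ** 2 - p2)
    return (r_in[0] + slope * p_in[0], r_in[1] + slope * p_in[1]), p_in


def paraxial_drift(r_in, p_in, n_c, dz):
    """Paraxial map: the square root is replaced by its leading term n_c."""
    slope = dz / n_c
    return (r_in[0] + slope * p_in[0], r_in[1] + slope * p_in[1]), p_in


r0, p0 = (0.0, 0.0), (0.3, 0.0)   # assumed initial position and momentum
r_exact, _ = exact_drift(r0, p0, 1.5, 10.0)
r_parax, _ = paraxial_drift(r0, p0, 1.5, 10.0)
aberration = r_exact[0] - r_parax[0]   # displacement missed by the paraxial map
```

The difference between the two positions grows with the transverse
momentum, which is the ray-optical content of the higher-order terms in the
exponent of the transfer operator.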



4.2 Axially Symmetric Graded Index Medium 

We just saw the treatment of the medium with a constant refractive index.
This is perhaps the only problem which can be solved exactly in a closed
form expression. This was just to illustrate how the aberration expansion
in our formalism can be obtained. We now consider the next example. The
refractive index of an axially symmetric graded-index material can be most
generally described by the following polynomial (see p. 117 in [1])

$$ n(\mathbf{r}) = n_0 + \alpha_2(z)r_\perp^2 + \alpha_4(z)r_\perp^4 + \cdots, \tag{41} $$

where we have assumed, without any loss of generality, the axis of symmetry
to coincide with the optic axis, namely the z-axis. We note,

$$ \begin{aligned}
\hat{\mathcal{E}} &= -\left\{\alpha_2(z)r_\perp^2 + \alpha_4(z)r_\perp^4 + \cdots\right\}\beta - i\bar{\lambda}\beta\boldsymbol{\Sigma}\cdot\mathbf{u}, \\
\hat{\mathcal{O}} &= i(M_y\hat{p}_x - M_x\hat{p}_y) = \beta(\mathbf{M}_\perp\cdot\hat{\mathbf{p}}_\perp),
\end{aligned} \tag{42} $$

where

$$ \boldsymbol{\Sigma}\cdot\mathbf{u} = -\frac{1}{n_0}\alpha_2(z)\,\boldsymbol{\Sigma}_\perp\cdot\mathbf{r}_\perp
 - \frac{1}{2n_0}\left(\frac{d}{dz}\alpha_2(z)\right)\Sigma_z r_\perp^2. \tag{43} $$
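To simplify the formal expression for the beam-optical Hamiltonian
$\hat{H}^{(4)}$ given in (24)-(25), we make use of the following:

$$ \begin{aligned}
(\mathbf{M}_\perp\cdot\hat{\mathbf{p}}_\perp)^2 &= \hat{p}_\perp^2, \qquad
 \hat{\mathcal{O}}^2 = -\hat{p}_\perp^2, \qquad
 \frac{\partial}{\partial z}\hat{\mathcal{O}} = 0, \\
(\mathbf{M}_\perp\cdot\hat{\mathbf{p}}_\perp)\,r_\perp^2\,(\mathbf{M}_\perp\cdot\hat{\mathbf{p}}_\perp)
 &= \frac{1}{2}\left(r_\perp^2\hat{p}_\perp^2 + \hat{p}_\perp^2r_\perp^2\right) + 2\bar{\lambda}\beta\hat{L}_z + 2\bar{\lambda}^2,
\end{aligned} \tag{44} $$

where $\hat{L}_z$ is the angular momentum. Finally, the beam-optical
Hamiltonian to order $\left(\frac{1}{n_0^2}\hat{p}_\perp^2\right)^2$ is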
T~t = H 0jP + #0,(4) 

~T~ n 0,(2) ""0,(4) 



iZi 



0,p 



if. 



0,(4) 



(A) 
0,(2) 



'0,(4) 



+H 

-no + ^-Pl - a 2 {z)r\ 

1 ~4 

-a 4 (z)r\ 

-h a2{z) ~^ a2{z)Zz+ h al{z)r 

iA 3 f d 

2nl \ dz 
UK 2 



a 2 2 (z) (r\L z + L z r 2 A + —^a 2 {z)a A (z)r 
a 2 {z) \ f3Y, z 



a 2 (z) (H x p y - E y p x 



i} 3 f d 



—a 2 (z) } T, Z L Z 



2np [dz 

wide + —^a 2 {z)(3 



Aril 
iX { d 



r±,P± 



8uq \ dz 



a 2 (z) } /3S 2 



+ 
(45)

where $[A, B]_+ = (AB + BA)$ and '···' are the numerous other terms arising
from the polarization term. We have retained only the leading order of such
terms above for an illustration. All these matrix terms, related to the
polarization, will be addressed elsewhere.

The reasons for partitioning the beam-optical Hamiltonian $\hat{H}$ in the
above manner are as follows. The paraxial Hamiltonian, $\hat{H}_{0,p}$,
describes the ideal behaviour. $\hat{H}_{0,(4)}$ is responsible for the
third-order aberrations. Both of these Hamiltonians are modified by the
wavelength-dependent contributions given in $\hat{H}^{(\bar{\lambda})}_{0,(2)}$
and $\hat{H}^{(\bar{\lambda})}_{0,(4)}$ respectively. Lastly, we have the
remaining part, which is associated with the polarization and shall be
examined elsewhere.
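
The origin of the lowest wavelength-dependent terms can be traced with a
short calculation. The sketch below is a worked reconstruction under stated
assumptions: the polarization term is dropped, only the quadratic part
$\alpha_2(z)r_\perp^2$ of the refractive index is kept, the commutation rule
$[x_j, \hat{p}_k] = i\bar{\lambda}\delta_{jk}$ is used, and
$\hat{L}_z = x\hat{p}_y - y\hat{p}_x$. It evaluates the anticommutator and
nested-commutator terms of the fourth-order Hamiltonian, both of which carry
the prefactor $1/8n_0^2$.

```latex
\begin{align*}
\sum_{j=x,y}\hat{p}_j\,r_\perp^2\,\hat{p}_j
  &= \tfrac{1}{2}\bigl(r_\perp^2\hat{p}_\perp^2 + \hat{p}_\perp^2 r_\perp^2\bigr) + 2\bar{\lambda}^2 , \\
\bigl[\hat{p}_x, [\hat{p}_y, r_\perp^2]_+\bigr] - \bigl[\hat{p}_y, [\hat{p}_x, r_\perp^2]_+\bigr]
  &= -4i\bar{\lambda}\,\hat{L}_z , \\
-\frac{\alpha_2}{8n_0^2}\Bigl\{[\hat{p}_\perp^2, r_\perp^2]_+ + 2\sum_{j}\hat{p}_j r_\perp^2\hat{p}_j\Bigr\}
  &= -\frac{\alpha_2}{4n_0^2}\bigl(r_\perp^2\hat{p}_\perp^2 + \hat{p}_\perp^2 r_\perp^2\bigr)
     - \frac{\bar{\lambda}^2}{2n_0^2}\,\alpha_2 , \\
-\frac{i\,\alpha_2}{8n_0^2}\bigl(-4i\bar{\lambda}\,\hat{L}_z\bigr)
  &= -\frac{\bar{\lambda}}{2n_0^2}\,\alpha_2\,\hat{L}_z .
\end{align*}
```

The first result on the third line is a traditional third-order aberration
term; the remaining two pieces vanish as $\bar{\lambda} \to 0$, and the term
linear in $\hat{L}_z$ is the one responsible for the image rotation.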

4.2.1 Image Rotation 

From these sub-Hamiltonians we make several observations:

The term $\frac{\bar{\lambda}}{2n_0^2}\alpha_2(z)\hat{L}_z$, which
contributes to the paraxial Hamiltonian, gives rise to an image rotation by
an angle $\theta(z)$:

$$ \theta(z'', z') = \frac{\bar{\lambda}}{2n_0^2}\int_{z'}^{z''}dz\,\alpha_2(z). \tag{46} $$

This image rotation (which need not be small) has no analogue in the
square-root approach [1, 2] and the scalar approach [3, 4].
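
The rotation angle is a one-dimensional integral and is straightforward to
evaluate for any given profile. The following sketch is illustrative only:
the function name, the trapezoidal quadrature and every numerical value
(reduced wavelength, index, fiber length, profile coefficient) are
assumptions chosen for the example, not data from the text.

```python
def image_rotation_angle(alpha2, z_in, z_out, lambda_bar, n0, steps=1000):
    """theta = (lambda_bar / (2 n0^2)) * integral of alpha2(z) dz (trapezoidal rule)."""
    dz = (z_out - z_in) / steps
    integral = 0.0
    for k in range(steps):
        z_left = z_in + k * dz
        integral += 0.5 * (alpha2(z_left) + alpha2(z_left + dz)) * dz
    return lambda_bar / (2.0 * n0 * n0) * integral


LAMBDA_BAR = 1.0e-7   # assumed reduced wavelength in metres
N0 = 1.5              # assumed on-axis refractive index
LENGTH = 0.1          # assumed propagation length in metres


def uniform_profile(z):
    """Hypothetical z-independent coefficient alpha_2 in 1/m^2."""
    return 2.0e3


theta = image_rotation_angle(uniform_profile, 0.0, LENGTH, LAMBDA_BAR, N0)
```

For a z-independent coefficient the angle reduces to
$\bar{\lambda}\alpha_2\Delta z/(2n_0^2)$, so it scales linearly with both
the wavelength and the length of the medium.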

4.2.2 Aberrations 

The Hamiltonian $\hat{H}_{0,(4)}$ is the one we have in the traditional prescriptions
and is responsible for the six aberrations. $\hat{H}^{(\bar{\lambda})}_{0,(4)}$ modifies the above six
aberrations by wavelength-dependent contributions and further gives rise to the
remaining three aberrations permitted by the axial symmetry. Before
proceeding further we enumerate all the nine aberrations permitted by the axial
symmetry. The axial symmetry permits exactly nine third-order aberrations
which are:

Symbol    Polynomial                                                                                                        Name
C         $\hat{p}_\perp^4$                                                                                                 Spherical Aberration
K         $[\hat{p}_\perp^2, (\hat{\mathbf{p}}_\perp\cdot\mathbf{r}_\perp + \mathbf{r}_\perp\cdot\hat{\mathbf{p}}_\perp)]_+$   Coma
k         $\hat{p}_\perp^2\hat{L}_z$                                                                                        Anisotropic Coma
A         $(\hat{\mathbf{p}}_\perp\cdot\mathbf{r}_\perp + \mathbf{r}_\perp\cdot\hat{\mathbf{p}}_\perp)^2$                     Astigmatism
a         $(\hat{\mathbf{p}}_\perp\cdot\mathbf{r}_\perp + \mathbf{r}_\perp\cdot\hat{\mathbf{p}}_\perp)\hat{L}_z$              Anisotropic Astigmatism
F         $(\hat{p}_\perp^2 r_\perp^2 + r_\perp^2\hat{p}_\perp^2)$                                                          Curvature of Field
D         $[r_\perp^2, (\hat{\mathbf{p}}_\perp\cdot\mathbf{r}_\perp + \mathbf{r}_\perp\cdot\hat{\mathbf{p}}_\perp)]_+$         Distortion
d         $r_\perp^2\hat{L}_z$                                                                                              Anisotropic Distortion
E         $r_\perp^4$                                                                                                       Nameless? or POCUS

The name POCUS is used in [1] on page 137.

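The axial symmetry allows only the terms (in the Hamiltonian) which
are produced out of $\hat{p}_\perp^2$, $r_\perp^2$, $(\hat{\mathbf{p}}_\perp\cdot\mathbf{r}_\perp + \mathbf{r}_\perp\cdot\hat{\mathbf{p}}_\perp)$ and $\hat{L}_z$. Combinatorially,
to fourth-order one would get ten terms including $\hat{L}_z^2$. We have listed nine of
them in the table above. The tenth one, namely $\hat{L}_z^2$, can be expressed in terms
of the others:

$$\hat{L}_z^2 = \frac{1}{2}\left(\hat{p}_\perp^2 r_\perp^2 + r_\perp^2\hat{p}_\perp^2\right) - \frac{1}{4}\left(\hat{\mathbf{p}}_\perp\cdot\mathbf{r}_\perp + \mathbf{r}_\perp\cdot\hat{\mathbf{p}}_\perp\right)^2 + \bar{\lambda}^2. \tag{47}$$

So, $\hat{L}_z^2$ is not listed separately. Hence, we have only nine third-order
aberrations permitted by axial symmetry, as stated earlier.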
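
The operator identity (47) can be checked mechanically by letting both sides act on a test polynomial in $x$ and $y$. The sketch below is a minimal check, assuming the usual realization $\hat{p}_x = -i\bar{\lambda}\,\partial/\partial x$, $\hat{p}_y = -i\bar{\lambda}\,\partial/\partial y$ and $\hat{L}_z = x\hat{p}_y - y\hat{p}_x$; the dictionary representation of polynomials and all helper names are illustrative.

```python
from collections import defaultdict

LAMBDABAR = 1.0  # illustrative value of the reduced wavelength

# A polynomial f(x, y) is stored as {(i, j): coefficient of x^i y^j}.
def combine(*terms):
    """Linear combination of polynomials given as (coefficient, polynomial) pairs."""
    out = defaultdict(complex)
    for coeff, poly in terms:
        for key, value in poly.items():
            out[key] += coeff * value
    return {k: v for k, v in out.items() if abs(v) > 1e-12}

def mul_x(f): return {(i + 1, j): c for (i, j), c in f.items()}
def mul_y(f): return {(i, j + 1): c for (i, j), c in f.items()}
def p_x(f): return {(i - 1, j): -1j * LAMBDABAR * i * c for (i, j), c in f.items() if i > 0}
def p_y(f): return {(i, j - 1): -1j * LAMBDABAR * j * c for (i, j), c in f.items() if j > 0}

def r2(f): return combine((1, mul_x(mul_x(f))), (1, mul_y(mul_y(f))))
def p2(f): return combine((1, p_x(p_x(f))), (1, p_y(p_y(f))))
def pr_rp(f):  # (p.r + r.p) acting on f
    return combine((1, p_x(mul_x(f))), (1, p_y(mul_y(f))),
                   (1, mul_x(p_x(f))), (1, mul_y(p_y(f))))
def l_z(f): return combine((1, mul_x(p_y(f))), (-1, mul_y(p_x(f))))

test_poly = {(3, 1): 2.0, (1, 2): -1.0 + 0.5j, (2, 2): 0.75, (0, 0): 1.0}

lhs = l_z(l_z(test_poly))
rhs = combine((0.5, p2(r2(test_poly))), (0.5, r2(p2(test_poly))),
              (-0.25, pr_rp(pr_rp(test_poly))), (LAMBDABAR ** 2, test_poly))
difference = combine((1, lhs), (-1, rhs))  # empty dict when both sides agree
```

The constant $\bar{\lambda}^2$ on the right-hand side is essential: acting on $f = 1$, the first two terms alone give $-\bar{\lambda}^2$ while $\hat{L}_z^2 f = 0$.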
The paraxial transfer maps are given by

$$\begin{pmatrix}\langle\mathbf{r}_\perp\rangle \\ \langle\mathbf{p}_\perp\rangle\end{pmatrix}_{\rm out} = \begin{pmatrix} P & Q \\ R & S \end{pmatrix}\begin{pmatrix}\langle\mathbf{r}_\perp\rangle \\ \langle\mathbf{p}_\perp\rangle\end{pmatrix}_{\rm in}, \tag{48}$$

where P, Q, R and S are the solutions of the paraxial Hamiltonian (45). The
symplecticity condition tells us that $PS - QR = 1$. In this particular case,
from the structure of the paraxial equations, we can further conclude that
$R = P'$ and $S = Q'$, where $(\ )'$ denotes the derivative.

The transfer operator is most accurately and neatly expressed in terms
of the paraxial solutions, P, Q, R and S, via the interaction picture of the
Lie algebraic formulation of light beam optics and charged-particle beam
optics [30].



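$$\begin{aligned}
\hat{\mathcal{T}}(z, z_0) &= \exp\left[-\frac{i}{\bar{\lambda}}\hat{T}(z, z_0)\right] \\
&= \exp\Big[-\frac{i}{\bar{\lambda}}\Big\{ C(z'', z')\,\hat{p}_\perp^4 \\
&\qquad + K(z'', z')\,[\hat{p}_\perp^2, (\hat{\mathbf{p}}_\perp\cdot\mathbf{r}_\perp + \mathbf{r}_\perp\cdot\hat{\mathbf{p}}_\perp)]_+ \\
&\qquad + k(z'', z')\,\hat{p}_\perp^2\hat{L}_z \\
&\qquad + A(z'', z')\,(\hat{\mathbf{p}}_\perp\cdot\mathbf{r}_\perp + \mathbf{r}_\perp\cdot\hat{\mathbf{p}}_\perp)^2 \\
&\qquad + a(z'', z')\,(\hat{\mathbf{p}}_\perp\cdot\mathbf{r}_\perp + \mathbf{r}_\perp\cdot\hat{\mathbf{p}}_\perp)\hat{L}_z \\
&\qquad + F(z'', z')\,(\hat{p}_\perp^2 r_\perp^2 + r_\perp^2\hat{p}_\perp^2) \\
&\qquad + D(z'', z')\,[r_\perp^2, (\hat{\mathbf{p}}_\perp\cdot\mathbf{r}_\perp + \mathbf{r}_\perp\cdot\hat{\mathbf{p}}_\perp)]_+ \\
&\qquad + d(z'', z')\,r_\perp^2\hat{L}_z \\
&\qquad + E(z'', z')\,r_\perp^4 \Big\}\Big].
\end{aligned} \tag{49}$$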



The nine aberration coefficients are given by, 

1 „ A a 2 (z) 



C (z" , z>) 



dz 



2rag 



2nl 



a 2 (z)a 4 (z)Q 4 



K (z" , z') 



k (z" , z') 
A (z" , z') 



I dz 



8n 3 



RS S - 



Q 2 (-g) 
Anl 



QS(PS + QR) - a 4 (z)PQ" 



2nl 



a 2 {z)a 4 {z)PQ z 



dz 



a (z" , z') 
F(z\z') 



2nl 

s: 

2~^j z > 

r" 



dza 2 (z)Q 2 
1 „o~o a 2 (z 



r2s2 _ ^K^ pQRS _ a ^ z) p2 Q 2 



2ni 



■^a 2 (z)a 4 (z)P 2 Q 2 



/ dzaj(z)PQ 

J z' 



I R 2 S 2_^z)_^ p 2 S 2 + Q 2 R 2 ) _ a ^ z)p 2 Q 



Snf 



o 
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D (z" , z') 



d (z" , z') 
E (z" , z>) 



Thus we see that the current approach gives rise to all the nine permissible
aberrations. The six aberrations familiar from the traditional prescriptions
get modified by the wavelength-dependent contributions. The extra three (k,
a and d, all anisotropic!) are pure wavelength-dependent aberrations and are
totally absent in the traditional square-root approach [1, 2] and the recently
developed scalar approach [3, 4]. A detailed account of the classification of
aberrations is available in [31]-[34].

5 Polarization 

Let there be Light! (with or/and without polarization) [18]. 

6 Concluding Remarks 

We have developed an exact matrix representation of the Maxwell equations 
taking into account the spatial and temporal variations of the permittivity 
and permeability. This representation, using 8x8 matrices is the basis for an 
exact formalism of Maxwell optics presented here. The exact beam optical 
Hamiltonian, derived from this representation has an algebraic structure in 
direct correspondence with the Dirac equation of the electron. We exploit 
this correspondence to adopt the standard machinery, namely the Foldy- 
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+^a 2 {z)a 4 {z)P 2 Q 2 



= / 2 dz \-^R 3 S - ^^-PR(PS + QR) - a±(z)P 3 Q 
Jz' I oHq 4uq 

t 2 



+^a 2 (z)a 4 (z)P 3 Q 



f Z dz \ JL^4 _ p2 R 2 _ ( )p 4 

Jz' \8n 3 2nl 4V ; 



1 rA a 2 (z 
^—a 2 (z)a 4 (z)P 4 \ . (50) 



Wouthuysen transformation technique of the Dirac theory, to the beam optical
formalism. This gave us a systematic procedure for obtaining
the aberration expansion from the beam-optical Hamiltonian to any desired
degree of accuracy. We further get the wavelength-dependent contributions
at each order, starting with the lowest-order paraxial Hamiltonian.
Formal expressions were obtained for the paraxial and leading order aberrating
Hamiltonians, without making any assumption on the form of the varying
refractive index.

The beam-optical Hamiltonians also have the wavelength-dependent ma- 
trix terms which are associated with the polarization. In this approach we 
have been able to derive a Hamiltonian which contains both the beam-optics 
and the polarization. 

In Section-IV, we applied the formalism to specific examples and saw
how the beam-optics (paraxial behaviour and the aberrations) gets modified
by the wavelength-dependent contributions. The first of the two examples is the
medium with a constant refractive index. This is perhaps the only problem
which can be solved exactly, in a closed form expression. This example is
primarily for illustrating certain aspects of the machinery we have used.

The second, and much more interesting, example is that of the axially
symmetric graded index medium. For this example, in the traditional
approaches one gets only six aberrations. In our formalism we get all the nine
aberrations permitted by the axial symmetry. The six aberration coefficients
of the traditional approaches get modified by the wavelength-dependent
contributions. It is very interesting to note that apart from the
wavelength-dependent modifications of the aberrations, this approach also gives
rise to the image rotation. This image rotation is proportional to the wavelength
and we have derived an explicit relationship for the angle in (46). Such an
image rotation has no analogue/counterpart in any of the traditional
prescriptions. It would be worthwhile to experimentally look for the predicted
image rotation. The existence of the nine aberrations and image rotation
is well-known in axially symmetric magnetic electron lenses, even when
treated classically. The quantum treatment of the same system leads to the
wavelength-dependent modifications [9].

The optical Hamiltonian has two components: beam optics and polarization
respectively. We have addressed the former in some detail and the
latter is in progress. The formalism initiated in this article provides a natural
framework for the study of light polarization. This would provide a unified
treatment for the beam-optics and the polarization. It also promises a
possible generalization of the substitution result in [17]. We shall present this
approach elsewhere [18].

The close analogy between geometrical optics and charged-particle beam
optics has been known for a long time. Until recently it was possible to
see this analogy only between geometrical optics and the classical
prescriptions of charged-particle optics. A quantum theory of charged-particle
optics was presented in recent years [5]-[10]. With the current development
of the non-traditional prescriptions of Helmholtz optics [3, 4] and the
matrix formulation of Maxwell optics presented here, using the rich algebraic
machinery of quantum mechanics, it is now possible to see a parallel of the
analogy at each level. The non-traditional prescription of Helmholtz
optics is in close analogy with the quantum theory of charged particles based on
the Klein-Gordon equation. The matrix formulation of Maxwell optics
presented here is in close analogy with the quantum theory of charged particles
based on the Dirac equation [35]. The parallel of these analogies is described
in Appendix-E.

An important omission in the present study is the study of the evolution of
the fields, $(\mathbf{E}, \mathbf{B})$, which we shall address in detail elsewhere [18]. Even
without the discussion of the fields (as is the case in several other prescriptions)
the present study is complete at the Hamiltonian level. We have presented
an alternate and exact way of deriving the beam optical Hamiltonian, which
reproduces the established results. Furthermore we have derived the extra
wavelength-dependent contributions. In the low wavelength limit our
formalism reproduces the Lie algebraic formalism of optics. The Foldy-Wouthuysen
technique employed by us is ideally suited for the Lie algebraic approach to
optics. The present study further strengthens the close analogy between the
various prescriptions of light and charged-particle beam optics [35].






Appendix-A 
Riemann-Silberstein Vector 



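The Riemann-Silberstein complex vector [23], $\mathbf{F}(\mathbf{r}, t)$, built from the
electric displacement field $\mathbf{D}(\mathbf{r}, t)$ and the magnetic field $\mathbf{B}(\mathbf{r}, t)$, is given by

$$\begin{aligned}
\mathbf{F}(\mathbf{r}, t) &= \frac{1}{\sqrt{2}}\left(\frac{1}{\sqrt{\epsilon(\mathbf{r})}}\mathbf{D}(\mathbf{r}, t) + i\frac{1}{\sqrt{\mu(\mathbf{r})}}\mathbf{B}(\mathbf{r}, t)\right) \\
&= \frac{1}{\sqrt{2}}\left(\sqrt{\epsilon(\mathbf{r})}\,\mathbf{E}(\mathbf{r}, t) + i\frac{1}{\sqrt{\mu(\mathbf{r})}}\mathbf{B}(\mathbf{r}, t)\right),
\end{aligned} \tag{A.1}$$

where $\epsilon(\mathbf{r})$ is the permittivity of the medium and $\mu(\mathbf{r})$ is the
permeability of the medium. In vacuum we have $\epsilon_0 = 8.85 \times 10^{-12}\ C^2/N\,m^2$ and
$\mu_0 = 4\pi \times 10^{-7}\ N/A^2$. The Riemann-Silberstein complex vector, $\mathbf{F}(\mathbf{r}, t)$, can
also be derived from the potential $\mathbf{Z}(\mathbf{r}, t)$ (for example, see [28]),

$$\mathbf{F}(\mathbf{r}, t) = \nabla\times\left[\frac{i}{v}\frac{\partial}{\partial t}\mathbf{Z}(\mathbf{r}, t) + \nabla\times\mathbf{Z}(\mathbf{r}, t)\right]. \tag{A.2}$$

$\mathbf{Z}(\mathbf{r}, t)$ is the superpotential and is commonly known as the polarization
potential or the Hertz vector (see [28]). This further leads to the wave
equation

$$\left\{\nabla^2 - \frac{1}{v^2}\frac{\partial^2}{\partial t^2}\right\}\mathbf{Z}(\mathbf{r}, t) = 0. \tag{A.3}$$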

Riemann-Silberstein vector can be used to express many of the quantities 
associated with the electromagnetic field: 

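$$\begin{aligned}
\text{Poynting Vector}:\quad \mathbf{S} &= \frac{1}{\mu}\,\mathbf{E}\times\mathbf{B} = -iv\left(\mathbf{F}^\dagger\times\mathbf{F}\right) \\
\text{Energy Density}:\quad u &= \frac{1}{2}\left(\epsilon\,\mathbf{E}\cdot\mathbf{E} + \frac{1}{\mu}\mathbf{B}\cdot\mathbf{B}\right) = \mathbf{F}^\dagger\cdot\mathbf{F} \\
\text{Momentum Density}:\quad \mathbf{p}_{EB} &= \epsilon\left(\mathbf{E}\times\mathbf{B}\right) = -\frac{i}{v}\left(\mathbf{F}^\dagger\times\mathbf{F}\right) \\
\text{Angular Momentum Density}:\quad \mathbf{L}_{EB} &= \epsilon\left\{\mathbf{r}\times\left(\mathbf{E}\times\mathbf{B}\right)\right\} = -\frac{i}{v}\left\{\mathbf{r}\times\left(\mathbf{F}^\dagger\times\mathbf{F}\right)\right\}.
\end{aligned} \tag{A.4}$$

And

$$\begin{aligned}
\text{Total Energy}:\quad E &= \frac{1}{2}\int d^3r\left\{\epsilon\,\mathbf{E}\cdot\mathbf{E} + \frac{1}{\mu}\mathbf{B}\cdot\mathbf{B}\right\} = \int d^3r\left\{\mathbf{F}^\dagger\cdot\mathbf{F}\right\} \\
\text{Total Momentum}:\quad \mathbf{P} &= \epsilon\int d^3r\left\{\mathbf{E}\times\mathbf{B}\right\} = -\frac{i}{v}\int d^3r\left\{\mathbf{F}^\dagger\times\mathbf{F}\right\} \\
\text{Total Angular Momentum}:\quad \mathbf{M} &= \epsilon\int d^3r\left\{\mathbf{r}\times\left(\mathbf{E}\times\mathbf{B}\right)\right\} = -\frac{i}{v}\int d^3r\left\{\mathbf{r}\times\left(\mathbf{F}^\dagger\times\mathbf{F}\right)\right\} \\
\text{Moment of Energy}:\quad \mathbf{N} &= \frac{1}{2}\int d^3r\left\{\mathbf{r}\left(\epsilon\,\mathbf{E}\cdot\mathbf{E} + \frac{1}{\mu}\mathbf{B}\cdot\mathbf{B}\right)\right\} = \int d^3r\left\{\mathbf{r}\left(\mathbf{F}^\dagger\cdot\mathbf{F}\right)\right\}
\end{aligned} \tag{A.5}$$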



In this form these quantities look like the quantum-mechanical expectation 
values! The use of the Riemann-Silberstein vector as a possible candidate for 
the photon wavefunction has been advocated for a long time [23]. 
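
The local identities in (A.4) are easy to confirm numerically. The following sketch builds $\mathbf{F}$ from (A.1) for real fields at a single point and compares the energy density and the Poynting vector computed both ways; the field values and the choice of medium are arbitrary illustrative numbers, and the helper names are hypothetical.

```python
import math

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def conj(a):
    return [complex(x).conjugate() for x in a]

# Illustrative medium and (real) field values at one point.
eps = 2.0 * 8.85e-12          # permittivity
mu = 4.0e-7 * math.pi         # permeability
v = 1.0 / math.sqrt(eps * mu)
E = [1.0, -2.0, 0.5]
B = [3.0e-9, 1.0e-9, -2.0e-9]

# Riemann-Silberstein vector, second form of Eq. (A.1).
F = [(math.sqrt(eps) * e + 1j * b / math.sqrt(mu)) / math.sqrt(2.0)
     for e, b in zip(E, B)]

u_fields = 0.5 * (eps * dot(E, E) + dot(B, B) / mu)
u_rs = dot(conj(F), F).real                      # F^dagger . F

S_fields = [c / mu for c in cross(E, B)]
S_rs = [(-1j * v * c).real for c in cross(conj(F), F)]   # -i v (F^dagger x F)
```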
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Appendix-B 
An Exact Matrix Representation of the 
Maxwell Equations in a Medium 

Matrix representations of the Maxwell equations are very well-known [21]-[23].
However, all these representations lack exactness or/and are given in
terms of a pair of matrix equations. Some of these representations are in free
space. Such a representation is an approximation in a medium with space-
and time-dependent permittivity $\epsilon(\mathbf{r}, t)$ and permeability $\mu(\mathbf{r}, t)$ respectively.
Even this approximation is often expressed through a pair of equations using
$3 \times 3$ matrices: one for the curl and one for the divergence which occur in the
Maxwell equations. This practice of writing the divergence condition
separately is completely avoidable by using $4 \times 4$ matrices for Maxwell equations
in free-space [21]. A single equation using $4 \times 4$ matrices is necessary and
sufficient when $\epsilon(\mathbf{r}, t)$ and $\mu(\mathbf{r}, t)$ are treated as 'local' constants [21, 23].

A treatment taking into account the variations of $\epsilon(\mathbf{r}, t)$ and $\mu(\mathbf{r}, t)$ has
been presented in [23]. This treatment uses the Riemann-Silberstein vectors,
$\mathbf{F}^\pm(\mathbf{r}, t)$, to reexpress the Maxwell equations as four equations: two equations
are for the curl and two are for the divergences, and there is mixing in $\mathbf{F}^+(\mathbf{r}, t)$
and $\mathbf{F}^-(\mathbf{r}, t)$. This mixing is very neatly expressed through the two derived
functions of $\epsilon(\mathbf{r}, t)$ and $\mu(\mathbf{r}, t)$. These four equations are then expressed as a
pair of matrix equations using $6 \times 6$ matrices: again one for the curl and one
for the divergence. Even though this treatment is exact it involves a pair of
matrix equations.

Here, we present a treatment which enables us to express the Maxwell
equations in a single matrix equation instead of a pair of matrix equations.
Our approach is a logical continuation of the treatment in [23]. We use the
linear combination of the components of the Riemann-Silberstein vectors,
$\mathbf{F}^\pm(\mathbf{r}, t)$, and the final matrix representation is a single equation using $8 \times 8$
matrices. This representation contains all the four Maxwell equations in
presence of sources, taking into account the spatial and temporal variations
of the permittivity $\epsilon(\mathbf{r}, t)$ and the permeability $\mu(\mathbf{r}, t)$.

In Section-I we shall summarize the treatment for a homogeneous medium
and introduce the required functions and notation. In Section-II we shall
present the matrix representation in an inhomogeneous medium, in presence
of sources.

B.l Homogeneous Medium 

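We shall start with the Maxwell equations [27, 28] in an inhomogeneous
medium with sources,

$$\begin{aligned}
\nabla\cdot\mathbf{D}(\mathbf{r}, t) &= \rho, \\
\nabla\times\mathbf{H}(\mathbf{r}, t) - \frac{\partial}{\partial t}\mathbf{D}(\mathbf{r}, t) &= \mathbf{J}, \\
\nabla\times\mathbf{E}(\mathbf{r}, t) + \frac{\partial}{\partial t}\mathbf{B}(\mathbf{r}, t) &= 0, \\
\nabla\cdot\mathbf{B}(\mathbf{r}, t) &= 0.
\end{aligned} \tag{B.1}$$

We assume the media to be linear, that is $\mathbf{D} = \epsilon\mathbf{E}$ and $\mathbf{B} = \mu\mathbf{H}$, where
$\epsilon$ is the permittivity of the medium and $\mu$ is the permeability of the
medium. In general $\epsilon = \epsilon(\mathbf{r}, t)$ and $\mu = \mu(\mathbf{r}, t)$. In this section we treat them
as 'local' constants in the various derivations. The magnitude of the velocity
of light in the medium is given by $v(\mathbf{r}, t) = |\mathbf{v}(\mathbf{r}, t)| = 1/\sqrt{\epsilon(\mathbf{r}, t)\mu(\mathbf{r}, t)}$. In
vacuum we have $\epsilon_0 = 8.85 \times 10^{-12}\ C^2/N\,m^2$ and $\mu_0 = 4\pi \times 10^{-7}\ N/A^2$.

One possible way to obtain the required matrix representation is to use
the Riemann-Silberstein vector [23] given by

$$\mathbf{F}^\pm(\mathbf{r}, t) = \frac{1}{\sqrt{2}}\left(\sqrt{\epsilon(\mathbf{r}, t)}\,\mathbf{E}(\mathbf{r}, t) \pm i\frac{1}{\sqrt{\mu(\mathbf{r}, t)}}\mathbf{B}(\mathbf{r}, t)\right). \tag{B.2}$$

For any homogeneous medium it is equivalent to use either $\mathbf{F}^+(\mathbf{r}, t)$ or
$\mathbf{F}^-(\mathbf{r}, t)$. The two differ by the sign before '$i$' and are not the complex
conjugate of one another. We have not assumed any form for $\mathbf{E}(\mathbf{r}, t)$ and
$\mathbf{B}(\mathbf{r}, t)$. We will be needing both of them in an inhomogeneous medium, to
be considered in detail in Section-II.

If for a certain medium $\epsilon(\mathbf{r}, t)$ and $\mu(\mathbf{r}, t)$ are constants (or can be treated
as 'local' constants under certain approximations), then the vectors $\mathbf{F}^\pm(\mathbf{r}, t)$
satisfy

$$\begin{aligned}
i\frac{\partial}{\partial t}\mathbf{F}^\pm(\mathbf{r}, t) &= \pm v\,\nabla\times\mathbf{F}^\pm(\mathbf{r}, t) - \frac{1}{\sqrt{2\epsilon}}\left(i\mathbf{J}\right), \\
\nabla\cdot\mathbf{F}^\pm(\mathbf{r}, t) &= \frac{1}{\sqrt{2\epsilon}}\left(\rho\right).
\end{aligned} \tag{B.3}$$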



Thus, by using the Riemann-Silberstein vector it has been possible to
reexpress the four Maxwell equations (for a medium with constant $\epsilon$ and $\mu$) as
two equations. The first one contains the two Maxwell equations with
curl and the second one contains the two Maxwell equations with divergences. The
first of the two equations in (B.3) can be immediately converted into a $3 \times 3$
matrix representation. However, this representation does not contain the
divergence conditions (the first and the fourth Maxwell equations) contained
in the second equation in (B.3). A further compactification is possible only
by expressing the Maxwell equations in a $4 \times 4$ matrix representation. To
this end, using the components of the Riemann-Silberstein vector, we define

$$\Psi^+(\mathbf{r}, t) = \begin{pmatrix} -F_x^+ + iF_y^+ \\ F_z^+ \\ F_z^+ \\ F_x^+ + iF_y^+ \end{pmatrix}, \qquad
\Psi^-(\mathbf{r}, t) = \begin{pmatrix} -F_x^- - iF_y^- \\ F_z^- \\ F_z^- \\ F_x^- - iF_y^- \end{pmatrix}. \tag{B.4}$$

The vectors for the sources are

$$W^+ = \frac{1}{\sqrt{2\epsilon}}\begin{pmatrix} -J_x + iJ_y \\ J_z - v\rho \\ J_z + v\rho \\ J_x + iJ_y \end{pmatrix}, \qquad
W^- = \frac{1}{\sqrt{2\epsilon}}\begin{pmatrix} -J_x - iJ_y \\ J_z - v\rho \\ J_z + v\rho \\ J_x - iJ_y \end{pmatrix}. \tag{B.5}$$

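Then we obtain

$$\begin{aligned}
\frac{\partial}{\partial t}\Psi^+ &= -v\left\{\mathbf{M}\cdot\nabla\right\}\Psi^+ - W^+, \\
\frac{\partial}{\partial t}\Psi^- &= -v\left\{\mathbf{M}^*\cdot\nabla\right\}\Psi^- - W^-,
\end{aligned} \tag{B.6}$$

where $(\ )^*$ denotes complex-conjugation and the triplet, $\mathbf{M} = (M_x, M_y, M_z)$,
is expressed in terms of

$$\Omega = \begin{pmatrix} 0 & -\mathbb{1} \\ \mathbb{1} & 0 \end{pmatrix}, \qquad
\beta = \begin{pmatrix} \mathbb{1} & 0 \\ 0 & -\mathbb{1} \end{pmatrix}, \qquad
\mathbb{1} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}. \tag{B.7}$$

Alternately, we may use the matrix $J = -\Omega$. Both differ by a sign. For
our purpose it is fine to use either $\Omega$ or $J$. However, they have a different
meaning: $J$ is contravariant and $\Omega$ is covariant. The matrix $\Omega$ corresponds
to the Lagrange brackets of classical mechanics and $J$ corresponds to the
Poisson brackets. An important relation is $\Omega = J^{-1}$. The M-matrices are:

$$\begin{aligned}
M_x &= \begin{pmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{pmatrix} = -\beta\Omega, \\
M_y &= \begin{pmatrix} 0 & 0 & -i & 0 \\ 0 & 0 & 0 & -i \\ i & 0 & 0 & 0 \\ 0 & i & 0 & 0 \end{pmatrix} = i\Omega, \\
M_z &= \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix} = \beta.
\end{aligned} \tag{B.8}$$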

Each of the four Maxwell equations is easily obtained from the matrix
representation in (B.6). This is done by taking the sums and differences of
row-I with row-IV and row-II with row-III respectively. The first three give
the y, x and z components of the curl and the last one gives the divergence
conditions present in the evolution equation (B.3).

It is to be noted that the matrices $\mathbf{M}$ are all non-singular and all are
hermitian. Moreover, they satisfy the usual algebra of the Dirac matrices,
including,

$$\begin{aligned}
M_x\beta &= -\beta M_x, \\
M_y\beta &= -\beta M_y, \\
M_x^2 = M_y^2 = M_z^2 &= I, \\
M_xM_y = -M_yM_x &= iM_z, \\
M_yM_z = -M_zM_y &= iM_x, \\
M_zM_x = -M_xM_z &= iM_y.
\end{aligned} \tag{B.9}$$
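
The relations (B.8) and (B.9) can be confirmed by direct matrix multiplication. The short check below writes the matrices of (B.7) and (B.8) out explicitly; the helper names `matmul`, `scale`, `equal` and `dagger` are illustrative.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def equal(a, b):
    return all(x == y for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def dagger(a):
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]

I4 = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
Omega = [[0, 0, -1, 0], [0, 0, 0, -1], [1, 0, 0, 0], [0, 1, 0, 0]]
beta = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]

Mx = [[0, 0, 1, 0], [0, 0, 0, 1], [1, 0, 0, 0], [0, 1, 0, 0]]
My = [[0, 0, -1j, 0], [0, 0, 0, -1j], [1j, 0, 0, 0], [0, 1j, 0, 0]]
Mz = beta

# (B.8): expressions of the M-matrices in terms of Omega and beta.
b8_ok = (equal(Mx, scale(-1, matmul(beta, Omega)))
         and equal(My, scale(1j, Omega)))

# (B.9): squares, cyclic products and anticommutation with beta.
b9_ok = (all(equal(matmul(M, M), I4) for M in (Mx, My, Mz))
         and equal(matmul(Mx, My), scale(1j, Mz))
         and equal(matmul(My, Mz), scale(1j, Mx))
         and equal(matmul(Mz, Mx), scale(1j, My))
         and equal(matmul(Mx, beta), scale(-1, matmul(beta, Mx)))
         and equal(matmul(My, beta), scale(-1, matmul(beta, My))))

hermitian_ok = all(equal(M, dagger(M)) for M in (Mx, My, Mz))
```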



Before proceeding further we note the following: the pair $(\Psi^\pm, \mathbf{M})$ is
not unique. Different choices of $\Psi^\pm$ would give rise to different $\mathbf{M}$, such
that the triplet $\mathbf{M}$ continues to satisfy the algebra of the Dirac matrices
in (B.9). We have preferred $\Psi^\pm$ via the Riemann-Silberstein vector (B.2)
in [23]. This vector has certain advantages over the other possible choices.
The Riemann-Silberstein vector is well-known in classical electrodynamics
and has certain interesting properties and uses [23].

In deriving the above $4 \times 4$ matrix representation of the Maxwell equations
we have ignored the spatial and temporal derivatives of $\epsilon(\mathbf{r}, t)$ and $\mu(\mathbf{r}, t)$ in
the first two of the Maxwell equations. We have treated $\epsilon$ and $\mu$ as 'local'
constants.



B.2 Inhomogeneous Medium 

In the previous section we wrote the evolution equations for the
Riemann-Silberstein vector in (B.3), for a medium, treating $\epsilon(\mathbf{r}, t)$ and $\mu(\mathbf{r}, t)$ as 'local'
constants. From these pairs of equations we wrote the matrix form of the
Maxwell equations. In this section we shall write the exact equations taking
into account the spatial and temporal variations of $\epsilon(\mathbf{r}, t)$ and $\mu(\mathbf{r}, t)$. It is
very much possible to write the required evolution equations using $\epsilon(\mathbf{r}, t)$ and
$\mu(\mathbf{r}, t)$. But we shall follow the procedure in [23] of using the two derived
laboratory functions

$$\begin{aligned}
\text{Velocity Function}:\quad v(\mathbf{r}, t) &= \frac{1}{\sqrt{\epsilon(\mathbf{r}, t)\mu(\mathbf{r}, t)}}, \\
\text{Resistance Function}:\quad h(\mathbf{r}, t) &= \sqrt{\frac{\mu(\mathbf{r}, t)}{\epsilon(\mathbf{r}, t)}}.
\end{aligned} \tag{B.10}$$
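
A small numerical sketch of (B.10) may help fix the units. Inverting the two definitions gives $\epsilon = 1/(vh)$ and $\mu = h/v$, which the round trip below confirms. The function names are hypothetical, and the vacuum constants are the standard approximate values, used only as an example.

```python
import math

def velocity_and_resistance(eps, mu):
    """Derived functions of Eq. (B.10): v = 1/sqrt(eps*mu), h = sqrt(mu/eps)."""
    return 1.0 / math.sqrt(eps * mu), math.sqrt(mu / eps)

def permittivity_and_permeability(v, h):
    """Inverse relations obtained by solving (B.10): eps = 1/(v*h), mu = h/v."""
    return 1.0 / (v * h), h / v

EPS0 = 8.854e-12           # C^2 / (N m^2), approximate vacuum permittivity
MU0 = 4.0e-7 * math.pi     # N / A^2, classical vacuum permeability

v0, h0 = velocity_and_resistance(EPS0, MU0)            # about 3.0e8 m/s and 377 Ohm
eps_back, mu_back = permittivity_and_permeability(v0, h0)
```

In vacuum the velocity function is the speed of light and the resistance function is the familiar impedance of free space, which is why both are directly measurable quantities.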



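The function $v(\mathbf{r}, t)$ has the dimensions of velocity and the function $h(\mathbf{r}, t)$
has the dimensions of resistance (measured in Ohms). We can equivalently
use the Conductance Function, $n(\mathbf{r}, t) = 1/h(\mathbf{r}, t) = \sqrt{\epsilon(\mathbf{r}, t)/\mu(\mathbf{r}, t)}$ (measured
in Ohms$^{-1}$ or Mhos!) in place of the resistance function, $h(\mathbf{r}, t)$. These
derived functions enable us to understand the dependence of the variations
more transparently [23]. Moreover the derived functions are the ones which
are measured experimentally. In terms of these functions, $\epsilon = 1/(vh)$ and
$\mu = h/v$. Using these functions the exact equations satisfied by $\mathbf{F}^\pm(\mathbf{r}, t)$
are

$$\begin{aligned}
i\frac{\partial}{\partial t}\mathbf{F}^+(\mathbf{r}, t) &= v(\mathbf{r}, t)\left(\nabla\times\mathbf{F}^+(\mathbf{r}, t)\right) + \frac{1}{2}\left(\nabla v(\mathbf{r}, t)\times\mathbf{F}^+(\mathbf{r}, t)\right) \\
&\quad + \frac{v(\mathbf{r}, t)}{2h(\mathbf{r}, t)}\left(\nabla h(\mathbf{r}, t)\times\mathbf{F}^-(\mathbf{r}, t)\right) - \frac{i}{\sqrt{2}}\sqrt{v(\mathbf{r}, t)h(\mathbf{r}, t)}\,\mathbf{J} \\
&\quad + \frac{i}{2}\frac{\dot{v}(\mathbf{r}, t)}{v(\mathbf{r}, t)}\mathbf{F}^+(\mathbf{r}, t) + \frac{i}{2}\frac{\dot{h}(\mathbf{r}, t)}{h(\mathbf{r}, t)}\mathbf{F}^-(\mathbf{r}, t), \\
i\frac{\partial}{\partial t}\mathbf{F}^-(\mathbf{r}, t) &= -v(\mathbf{r}, t)\left(\nabla\times\mathbf{F}^-(\mathbf{r}, t)\right) - \frac{1}{2}\left(\nabla v(\mathbf{r}, t)\times\mathbf{F}^-(\mathbf{r}, t)\right) \\
&\quad - \frac{v(\mathbf{r}, t)}{2h(\mathbf{r}, t)}\left(\nabla h(\mathbf{r}, t)\times\mathbf{F}^+(\mathbf{r}, t)\right) - \frac{i}{\sqrt{2}}\sqrt{v(\mathbf{r}, t)h(\mathbf{r}, t)}\,\mathbf{J} \\
&\quad + \frac{i}{2}\frac{\dot{v}(\mathbf{r}, t)}{v(\mathbf{r}, t)}\mathbf{F}^-(\mathbf{r}, t) + \frac{i}{2}\frac{\dot{h}(\mathbf{r}, t)}{h(\mathbf{r}, t)}\mathbf{F}^+(\mathbf{r}, t), \\
\nabla\cdot\mathbf{F}^+(\mathbf{r}, t) &= \frac{1}{2v(\mathbf{r}, t)}\left(\nabla v(\mathbf{r}, t)\cdot\mathbf{F}^+(\mathbf{r}, t)\right) + \frac{1}{2h(\mathbf{r}, t)}\left(\nabla h(\mathbf{r}, t)\cdot\mathbf{F}^-(\mathbf{r}, t)\right) \\
&\quad + \frac{1}{\sqrt{2}}\sqrt{v(\mathbf{r}, t)h(\mathbf{r}, t)}\,\rho, \\
\nabla\cdot\mathbf{F}^-(\mathbf{r}, t) &= \frac{1}{2v(\mathbf{r}, t)}\left(\nabla v(\mathbf{r}, t)\cdot\mathbf{F}^-(\mathbf{r}, t)\right) + \frac{1}{2h(\mathbf{r}, t)}\left(\nabla h(\mathbf{r}, t)\cdot\mathbf{F}^+(\mathbf{r}, t)\right) \\
&\quad + \frac{1}{\sqrt{2}}\sqrt{v(\mathbf{r}, t)h(\mathbf{r}, t)}\,\rho,
\end{aligned} \tag{B.11}$$

where $\dot{v} = \frac{\partial v}{\partial t}$ and $\dot{h} = \frac{\partial h}{\partial t}$. The evolution equations in (B.11) are exact (for
a linear medium) and the dependence on the variations of $\epsilon(\mathbf{r}, t)$ and $\mu(\mathbf{r}, t)$
has been neatly expressed through the two derived functions. The coupling
between $\mathbf{F}^+(\mathbf{r}, t)$ and $\mathbf{F}^-(\mathbf{r}, t)$ is via the gradient and time-derivative of
only one derived function, namely $h(\mathbf{r}, t)$ or equivalently $n(\mathbf{r}, t)$. Either of
these can be used and both are directly measured quantities. We further
note that the dependence of the coupling is logarithmic:

$$\frac{1}{h(\mathbf{r}, t)}\nabla h(\mathbf{r}, t) = \nabla\left\{\ln\left(h(\mathbf{r}, t)\right)\right\}, \qquad
\frac{1}{h(\mathbf{r}, t)}\dot{h}(\mathbf{r}, t) = \frac{\partial}{\partial t}\left\{\ln\left(h(\mathbf{r}, t)\right)\right\}, \tag{B.12}$$

where 'ln' is the natural logarithm.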

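The coupling can be best summarized by expressing the equations in (B.11)
in a (block) matrix form. For this we introduce the following logarithmic
function

$$\mathcal{L}(\mathbf{r}, t) = \frac{1}{2}\left\{\mathbb{1}\ln\left(v(\mathbf{r}, t)\right) + \sigma_x\ln\left(h(\mathbf{r}, t)\right)\right\}, \tag{B.13}$$

where $\sigma_x$ is one of the triplet of the Pauli matrices

$$\boldsymbol{\sigma} = \left[\sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix},\ \sigma_y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix},\ \sigma_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}\right]. \tag{B.14}$$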



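Using the above notation the matrix form of the equations in (B.11) is

$$\begin{aligned}
i\left\{\mathbb{1}\frac{\partial}{\partial t} - \frac{\partial\mathcal{L}}{\partial t}\right\}\begin{pmatrix}\mathbf{F}^+(\mathbf{r}, t) \\ \mathbf{F}^-(\mathbf{r}, t)\end{pmatrix}
&= v(\mathbf{r}, t)\,\sigma_z\left\{\mathbb{1}\nabla + \nabla\mathcal{L}\right\}\times\begin{pmatrix}\mathbf{F}^+(\mathbf{r}, t) \\ \mathbf{F}^-(\mathbf{r}, t)\end{pmatrix}
- \frac{i}{\sqrt{2}}\sqrt{v(\mathbf{r}, t)h(\mathbf{r}, t)}\begin{pmatrix}\mathbf{J} \\ \mathbf{J}\end{pmatrix}, \\
\left\{\mathbb{1}\nabla - \nabla\mathcal{L}\right\}\cdot\begin{pmatrix}\mathbf{F}^+(\mathbf{r}, t) \\ \mathbf{F}^-(\mathbf{r}, t)\end{pmatrix}
&= \frac{1}{\sqrt{2}}\sqrt{v(\mathbf{r}, t)h(\mathbf{r}, t)}\begin{pmatrix}\rho \\ \rho\end{pmatrix},
\end{aligned} \tag{B.15}$$

where the dot-product and the cross-product are to be understood as

$$\begin{aligned}
\begin{pmatrix}\mathbf{A} & \mathbf{B} \\ \mathbf{C} & \mathbf{D}\end{pmatrix}\cdot\begin{pmatrix}\mathbf{u} \\ \mathbf{v}\end{pmatrix} &= \begin{pmatrix}\mathbf{A}\cdot\mathbf{u} + \mathbf{B}\cdot\mathbf{v} \\ \mathbf{C}\cdot\mathbf{u} + \mathbf{D}\cdot\mathbf{v}\end{pmatrix}, \\
\begin{pmatrix}\mathbf{A} & \mathbf{B} \\ \mathbf{C} & \mathbf{D}\end{pmatrix}\times\begin{pmatrix}\mathbf{u} \\ \mathbf{v}\end{pmatrix} &= \begin{pmatrix}\mathbf{A}\times\mathbf{u} + \mathbf{B}\times\mathbf{v} \\ \mathbf{C}\times\mathbf{u} + \mathbf{D}\times\mathbf{v}\end{pmatrix}.
\end{aligned} \tag{B.16}$$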



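It is to be noted that the $6 \times 6$ matrices in the evolution equations in (B.15)
are either hermitian or antihermitian. Any dependence on the variations
of $\epsilon(\mathbf{r}, t)$ and $\mu(\mathbf{r}, t)$ is at best 'weak'. We further note, $\nabla\left(\ln\left(v(\mathbf{r}, t)\right)\right) =
-\nabla\left(\ln\left(n(\mathbf{r}, t)\right)\right)$ and $\frac{\partial}{\partial t}\left(\ln\left(v(\mathbf{r}, t)\right)\right) = -\frac{\partial}{\partial t}\left(\ln\left(n(\mathbf{r}, t)\right)\right)$. In some media,
the coupling may vanish ($\nabla h(\mathbf{r}, t) = 0$ and $\dot{h}(\mathbf{r}, t) = 0$) and in the same
medium the refractive index, $n(\mathbf{r}, t) = c/v(\mathbf{r}, t)$, may vary ($\nabla n(\mathbf{r}, t) \neq 0$
or/and $\dot{n}(\mathbf{r}, t) \neq 0$). It may be further possible to use the approximations
$\nabla\left(\ln\left(h(\mathbf{r}, t)\right)\right) \approx 0$ and $\frac{\partial}{\partial t}\left(\ln\left(h(\mathbf{r}, t)\right)\right) \approx 0$.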

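We shall be using the following matrices to express the exact representation

$$\boldsymbol{\Sigma} = \begin{pmatrix}\boldsymbol{\sigma} & 0 \\ 0 & \boldsymbol{\sigma}\end{pmatrix}, \qquad
\boldsymbol{\alpha} = \begin{pmatrix} 0 & \boldsymbol{\sigma} \\ \boldsymbol{\sigma} & 0\end{pmatrix}, \qquad
I = \begin{pmatrix}\mathbb{1} & 0 \\ 0 & \mathbb{1}\end{pmatrix}, \tag{B.17}$$

where $\boldsymbol{\Sigma}$ are the Dirac spin matrices and $\boldsymbol{\alpha}$ are the matrices used in the Dirac
equation. Then,

$$\begin{aligned}
&\frac{\partial}{\partial t}\begin{pmatrix} I & 0 \\ 0 & I\end{pmatrix}\begin{pmatrix}\Psi^+ \\ \Psi^-\end{pmatrix}
- \frac{\dot{v}(\mathbf{r}, t)}{2v(\mathbf{r}, t)}\begin{pmatrix} I & 0 \\ 0 & I\end{pmatrix}\begin{pmatrix}\Psi^+ \\ \Psi^-\end{pmatrix}
+ \frac{\dot{h}(\mathbf{r}, t)}{2h(\mathbf{r}, t)}\begin{pmatrix} 0 & i\beta\alpha_y \\ i\beta\alpha_y & 0\end{pmatrix}\begin{pmatrix}\Psi^+ \\ \Psi^-\end{pmatrix} \\
&\quad = -v(\mathbf{r}, t)\begin{pmatrix}\left\{\mathbf{M}\cdot\nabla + \boldsymbol{\Sigma}\cdot\mathbf{u}\right\} & -i\beta\left(\boldsymbol{\Sigma}\cdot\mathbf{w}\right)\alpha_y \\ -i\beta\left(\boldsymbol{\Sigma}^*\cdot\mathbf{w}\right)\alpha_y & \left\{\mathbf{M}^*\cdot\nabla + \boldsymbol{\Sigma}^*\cdot\mathbf{u}\right\}\end{pmatrix}\begin{pmatrix}\Psi^+ \\ \Psi^-\end{pmatrix}
- \begin{pmatrix} I & 0 \\ 0 & I\end{pmatrix}\begin{pmatrix} W^+ \\ W^-\end{pmatrix},
\end{aligned} \tag{B.18}$$

where

$$\begin{aligned}
\mathbf{u}(\mathbf{r}, t) &= \frac{1}{2v(\mathbf{r}, t)}\nabla v(\mathbf{r}, t) = \frac{1}{2}\nabla\left\{\ln v(\mathbf{r}, t)\right\} = -\frac{1}{2}\nabla\left\{\ln n(\mathbf{r}, t)\right\}, \\
\mathbf{w}(\mathbf{r}, t) &= \frac{1}{2h(\mathbf{r}, t)}\nabla h(\mathbf{r}, t) = \frac{1}{2}\nabla\left\{\ln h(\mathbf{r}, t)\right\}.
\end{aligned} \tag{B.19}$$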



The above representation contains thirteen 8x8 matrices! Ten of these 
are hermitian. The exceptional ones are the ones that contain the three 
components of w(r,t), the logarithmic gradient of the resistance function. 
These three matrices, for the resistance function are antihermitian. 

We have been able to express the Maxwell equations in a matrix form in a
medium with varying permittivity $\epsilon(\mathbf{r}, t)$ and permeability $\mu(\mathbf{r}, t)$, in presence
of sources. We have been able to do so using a single equation instead of a
pair of matrix equations. We have used $8 \times 8$ matrices and have been able
to separate the dependence of the coupling between the upper components
($\Psi^+$) and the lower components ($\Psi^-$) through the two laboratory functions.

Moreover the exact matrix representation has an algebraic structure very
similar to the Dirac equation. We feel that this representation would be more
suitable for some of the studies related to the photon wave function.

Appendix-C
Foldy-Wouthuysen Transformation



In the traditional scheme the purpose of expanding the light optics Hamiltonian
$\hat{H} = -\left(n^2(\mathbf{r}) - \hat{p}_\perp^2\right)^{1/2}$ in a series using $\frac{1}{n_0^2}\hat{p}_\perp^2$ as the expansion
parameter is to understand the propagation of the quasiparaxial beam in
terms of a series of approximations (paraxial + nonparaxial). The situation is
similar in the case of charged-particle optics. Let us recall that in
relativistic quantum mechanics too one has a similar problem of understanding
the relativistic wave equations as the nonrelativistic approximation plus the
relativistic correction terms in the quasirelativistic regime. For the Dirac
equation (which is first order in time) this is done most conveniently using
the Foldy-Wouthuysen transformation leading to an iterative diagonalization
technique.

The main framework of the formalism of optics used here (and in
charged-particle optics) is based on the transformation technique of the
Foldy-Wouthuysen theory, which casts the Dirac equation in a form
displaying the different interaction terms between the Dirac particle and
an applied electromagnetic field in a nonrelativistic and easily interpretable
form (see [19], [36]-[38] for a general discussion of the role of
Foldy-Wouthuysen-type transformations in the particle interpretation of relativistic
wave equations). The suggestion to employ the Foldy-Wouthuysen
transformation technique in the case of the Helmholtz equation was mentioned
in the literature as a remark [39]. It was only in recent works that
this idea was exploited to analyze the quasiparaxial approximations for
specific beam optical systems [3, 4]. The Foldy-Wouthuysen technique is
ideally suited for the Lie algebraic approach to optics. Despite all these plus
points, the powerful and ambiguity-free expansion, the Foldy-Wouthuysen
transformation is still little used in optics [40]. In the Foldy-Wouthuysen
theory the Dirac equation is decoupled through a canonical transformation
into two two-component equations: one reduces to the Pauli equation in the
nonrelativistic limit and the other describes the negative-energy states.

Let us describe here briefly the standard Foldy-Wouthuysen theory so that
the way it has been adopted for the purposes of the above studies in optics
will be clear. Let us consider a charged particle of rest-mass $m_0$, charge $q$ in
the presence of an electromagnetic field characterized by $\mathbf{E} = -\nabla\phi - \frac{\partial}{\partial t}\mathbf{A}$
and $\mathbf{B} = \nabla\times\mathbf{A}$. Then the Dirac equation is



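$$i\hbar\frac{\partial}{\partial t}\Psi(\mathbf{r}, t) = \hat{H}_D\Psi(\mathbf{r}, t), \tag{C.1}$$

$$\begin{aligned}
\hat{H}_D &= m_0c^2\beta + q\phi + c\boldsymbol{\alpha}\cdot\hat{\boldsymbol{\pi}} \\
&= m_0c^2\beta + \hat{\mathcal{E}} + \hat{\mathcal{O}}, \\
\hat{\mathcal{E}} &= q\phi, \\
\hat{\mathcal{O}} &= c\boldsymbol{\alpha}\cdot\hat{\boldsymbol{\pi}},
\end{aligned} \tag{C.2}$$

where

$$\begin{aligned}
\boldsymbol{\alpha} &= \begin{pmatrix} 0 & \boldsymbol{\sigma} \\ \boldsymbol{\sigma} & 0\end{pmatrix}, \qquad
\beta = \begin{pmatrix}\mathbb{1} & 0 \\ 0 & -\mathbb{1}\end{pmatrix}, \qquad
\mathbb{1} = \begin{pmatrix} 1 & 0 \\ 0 & 1\end{pmatrix}, \\
\boldsymbol{\sigma} &= \left[\sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0\end{pmatrix},\ \sigma_y = \begin{pmatrix} 0 & -i \\ i & 0\end{pmatrix},\ \sigma_z = \begin{pmatrix} 1 & 0 \\ 0 & -1\end{pmatrix}\right],
\end{aligned} \tag{C.3}$$

with $\hat{\boldsymbol{\pi}} = \hat{\mathbf{p}} - q\mathbf{A}$, $\hat{\mathbf{p}} = -i\hbar\nabla$, and $\hat{\pi}^2 = \hat{\pi}_x^2 + \hat{\pi}_y^2 + \hat{\pi}_z^2$.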

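In the nonrelativistic situation the upper pair of components of the Dirac
spinor $\Psi$ are large compared to the lower pair of components. The operator
$\hat{\mathcal{E}}$, which does not couple the large and small components of $\Psi$, is called
'even' and $\hat{\mathcal{O}}$ is called an 'odd' operator, which couples the large to the small
components. Note that

$$\beta\hat{\mathcal{O}} = -\hat{\mathcal{O}}\beta, \qquad \beta\hat{\mathcal{E}} = \hat{\mathcal{E}}\beta. \tag{C.4}$$

Now, the search is for a unitary transformation, $\Psi' = \Psi \longrightarrow \hat{U}\Psi$, such that
the equation for $\Psi'$ does not contain any odd operator.

In the free particle case (with $\phi = 0$ and $\hat{\boldsymbol{\pi}} = \hat{\mathbf{p}}$) such a Foldy-Wouthuysen
transformation is given by

$$\begin{aligned}
\Psi \longrightarrow \Psi' &= \hat{U}_F\Psi, \\
\hat{U}_F = e^{i\hat{S}} &= e^{\beta\boldsymbol{\alpha}\cdot\hat{\mathbf{p}}\,\theta}, \qquad \tan 2|\hat{\mathbf{p}}|\theta = \frac{|\hat{\mathbf{p}}|}{m_0c}.
\end{aligned} \tag{C.5}$$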

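This transformation eliminates the odd part completely from the free particle
Dirac Hamiltonian reducing it to the diagonal form:

$$\begin{aligned}
i\hbar\frac{\partial}{\partial t}\Psi' &= e^{i\hat{S}}\left(m_0c^2\beta + c\boldsymbol{\alpha}\cdot\hat{\mathbf{p}}\right)e^{-i\hat{S}}\Psi' \\
&= \left(\cos|\hat{\mathbf{p}}|\theta + \frac{\beta\boldsymbol{\alpha}\cdot\hat{\mathbf{p}}}{|\hat{\mathbf{p}}|}\sin|\hat{\mathbf{p}}|\theta\right)\left(m_0c^2\beta + c\boldsymbol{\alpha}\cdot\hat{\mathbf{p}}\right) \\
&\quad\times\left(\cos|\hat{\mathbf{p}}|\theta - \frac{\beta\boldsymbol{\alpha}\cdot\hat{\mathbf{p}}}{|\hat{\mathbf{p}}|}\sin|\hat{\mathbf{p}}|\theta\right)\Psi' \\
&= \left(m_0c^2\cos 2|\hat{\mathbf{p}}|\theta + c|\hat{\mathbf{p}}|\sin 2|\hat{\mathbf{p}}|\theta\right)\beta\Psi'.
\end{aligned} \tag{C.6}$$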
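
The free-particle diagonalization in (C.5) and (C.6) can be checked numerically for a fixed momentum, replacing the operator $\hat{\mathbf{p}}$ by an ordinary vector. With the angle fixed by (C.5), the bracket in the last line of (C.6) equals $\sqrt{m_0^2c^4 + c^2p^2}$. The sketch below assumes units with $m_0 = c = 1$ and an arbitrary momentum; the helper names are illustrative.

```python
import math

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def lincomb(*terms):
    """Sum of coefficient * matrix over (coefficient, matrix) pairs."""
    n = len(terms[0][1])
    return [[sum(c * m[i][j] for c, m in terms) for j in range(n)] for i in range(n)]

def block_offdiag(s):
    """4x4 matrix [[0, s], [s, 0]] built from a 2x2 block s."""
    return [[0, 0] + s[0], [0, 0] + s[1], s[0] + [0, 0], s[1] + [0, 0]]

sigma = ([[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]])
alpha = [block_offdiag(s) for s in sigma]
beta = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
identity = [[1 if i == j else 0 for j in range(4)] for i in range(4)]

m0, c = 1.0, 1.0                      # assumption: natural units
p = (0.3, -0.4, 1.2)                  # arbitrary illustrative momentum
p_norm = math.sqrt(sum(x * x for x in p))

alpha_dot_p = lincomb(*zip(p, alpha))
beta_alpha_p = matmul(beta, alpha_dot_p)
angle = 0.5 * math.atan2(p_norm, m0 * c)   # |p| * theta from tan(2|p|theta) = |p|/(m0 c)

U = lincomb((math.cos(angle), identity), (math.sin(angle) / p_norm, beta_alpha_p))
U_inv = lincomb((math.cos(angle), identity), (-math.sin(angle) / p_norm, beta_alpha_p))
H = lincomb((m0 * c * c, beta), (c, alpha_dot_p))

H_transformed = matmul(matmul(U, H), U_inv)
energy = math.sqrt(m0 ** 2 * c ** 4 + c ** 2 * p_norm ** 2)
max_dev = max(abs(H_transformed[i][j] - energy * beta[i][j])
              for i in range(4) for j in range(4))
```

The quantity `max_dev` measures how far the transformed Hamiltonian is from $\sqrt{m_0^2c^4 + c^2p^2}\,\beta$; it should vanish up to rounding, confirming that no odd part survives in the free case.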



In the general case, when the electron is in a time-dependent electromagnetic
field it is not possible to construct an $\exp(i\hat{S})$ which removes the odd
operators from the transformed Hamiltonian completely. Therefore, one has
to be content with a nonrelativistic expansion of the transformed Hamiltonian
in a power series in $1/m_0c^2$ keeping through any desired order. Note
that in the nonrelativistic case, when $|\mathbf{p}| \ll m_0c$, the transformation operator
is $\hat{U}_F = \exp(i\hat{S})$ with $\hat{S} \approx -i\beta\hat{\mathcal{O}}/2m_0c^2$, where $\hat{\mathcal{O}} = c\boldsymbol{\alpha}\cdot\hat{\mathbf{p}}$ is the odd
part of the free Hamiltonian. So, in the general case we can start with the
transformation

$$\Psi^{(1)} = e^{i\hat{S}_1}\Psi, \qquad \hat{S}_1 = -\frac{i\beta\hat{\mathcal{O}}}{2m_0c^2} = -\frac{i\beta\boldsymbol{\alpha}\cdot\hat{\boldsymbol{\pi}}}{2m_0c}. \tag{C.7}$$



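Then, the equation for $\Psi^{(1)}$ is

$$\begin{aligned}
i\hbar\frac{\partial}{\partial t}\Psi^{(1)} &= i\hbar\frac{\partial}{\partial t}\left(e^{i\hat{S}_1}\Psi\right) = i\hbar\frac{\partial}{\partial t}\left(e^{i\hat{S}_1}\right)\Psi + e^{i\hat{S}_1}\left(i\hbar\frac{\partial}{\partial t}\Psi\right) \\
&= \left[i\hbar\frac{\partial}{\partial t}\left(e^{i\hat{S}_1}\right) + e^{i\hat{S}_1}\hat{H}_D\right]\Psi \\
&= \left[i\hbar\frac{\partial}{\partial t}\left(e^{i\hat{S}_1}\right)e^{-i\hat{S}_1} + e^{i\hat{S}_1}\hat{H}_De^{-i\hat{S}_1}\right]\Psi^{(1)} \\
&= \left[e^{i\hat{S}_1}\hat{H}_De^{-i\hat{S}_1} - i\hbar e^{i\hat{S}_1}\frac{\partial}{\partial t}\left(e^{-i\hat{S}_1}\right)\right]\Psi^{(1)} \\
&= \hat{H}_D^{(1)}\Psi^{(1)},
\end{aligned} \tag{C.8}$$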



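where we have used the identity $\frac{\partial}{\partial t}\left(e^{\hat{A}}\right)e^{-\hat{A}} + e^{\hat{A}}\frac{\partial}{\partial t}\left(e^{-\hat{A}}\right) = \frac{\partial}{\partial t}\hat{I} = 0$.
Now, using the identities

$$\begin{aligned}
e^{\hat{A}}\hat{B}e^{-\hat{A}} &= \hat{B} + [\hat{A}, \hat{B}] + \frac{1}{2!}[\hat{A}, [\hat{A}, \hat{B}]] + \frac{1}{3!}[\hat{A}, [\hat{A}, [\hat{A}, \hat{B}]]] + \ldots \\
e^{\hat{A}(t)}\frac{\partial}{\partial t}\left(e^{-\hat{A}(t)}\right)
&= \left(1 + \hat{A}(t) + \frac{1}{2!}\hat{A}(t)^2 + \frac{1}{3!}\hat{A}(t)^3 \cdots\right) \\
&\quad\times\frac{\partial}{\partial t}\left(1 - \hat{A}(t) + \frac{1}{2!}\hat{A}(t)^2 - \frac{1}{3!}\hat{A}(t)^3 \cdots\right) \\
&= \left(1 + \hat{A}(t) + \frac{1}{2!}\hat{A}(t)^2 + \frac{1}{3!}\hat{A}(t)^3 \cdots\right) \\
&\quad\times\left(-\frac{\partial\hat{A}(t)}{\partial t} + \frac{1}{2!}\left\{\frac{\partial\hat{A}(t)}{\partial t}\hat{A}(t) + \hat{A}(t)\frac{\partial\hat{A}(t)}{\partial t}\right\}\right. \\
&\qquad\left. - \frac{1}{3!}\left\{\frac{\partial\hat{A}(t)}{\partial t}\hat{A}(t)^2 + \hat{A}(t)\frac{\partial\hat{A}(t)}{\partial t}\hat{A}(t) + \hat{A}(t)^2\frac{\partial\hat{A}(t)}{\partial t}\right\} \cdots\right) \\
&\approx -\frac{\partial\hat{A}(t)}{\partial t} - \frac{1}{2!}\left[\hat{A}(t), \frac{\partial\hat{A}(t)}{\partial t}\right] - \frac{1}{3!}\left[\hat{A}(t), \left[\hat{A}(t), \frac{\partial\hat{A}(t)}{\partial t}\right]\right] \\
&\quad - \frac{1}{4!}\left[\hat{A}(t), \left[\hat{A}(t), \left[\hat{A}(t), \frac{\partial\hat{A}(t)}{\partial t}\right]\right]\right],
\end{aligned} \tag{C.9}$$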



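with $\hat{A} = i\hat{S}_1$, we find

$$\begin{aligned}
\hat{H}_D^{(1)} &\approx \hat{H}_D - \hbar\frac{\partial\hat{S}_1}{\partial t} + i\left[\hat{S}_1, \hat{H}_D - \frac{\hbar}{2}\frac{\partial\hat{S}_1}{\partial t}\right] \\
&\quad - \frac{1}{2!}\left[\hat{S}_1, \left[\hat{S}_1, \hat{H}_D - \frac{\hbar}{3}\frac{\partial\hat{S}_1}{\partial t}\right]\right] \\
&\quad - \frac{i}{3!}\left[\hat{S}_1, \left[\hat{S}_1, \left[\hat{S}_1, \hat{H}_D - \frac{\hbar}{4}\frac{\partial\hat{S}_1}{\partial t}\right]\right]\right].
\end{aligned} \tag{C.10}$$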



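Substituting in (C.10), $\hat{H}_D = m_0c^2\beta + \hat{\mathcal{E}} + \hat{\mathcal{O}}$, simplifying the right hand
side using the relations $\beta\hat{\mathcal{O}} = -\hat{\mathcal{O}}\beta$ and $\beta\hat{\mathcal{E}} = \hat{\mathcal{E}}\beta$ and collecting everything
together, we have

$$\begin{aligned}
\hat{H}_D^{(1)} &\approx m_0c^2\beta + \hat{\mathcal{E}}_1 + \hat{\mathcal{O}}_1, \\
\hat{\mathcal{E}}_1 &\approx \hat{\mathcal{E}} + \frac{1}{2m_0c^2}\beta\hat{\mathcal{O}}^2 - \frac{1}{8m_0^2c^4}\left[\hat{\mathcal{O}}, \left([\hat{\mathcal{O}}, \hat{\mathcal{E}}] + i\hbar\frac{\partial\hat{\mathcal{O}}}{\partial t}\right)\right] - \frac{1}{8m_0^3c^6}\beta\hat{\mathcal{O}}^4, \\
\hat{\mathcal{O}}_1 &\approx \frac{\beta}{2m_0c^2}\left([\hat{\mathcal{O}}, \hat{\mathcal{E}}] + i\hbar\frac{\partial\hat{\mathcal{O}}}{\partial t}\right) - \frac{1}{3m_0^2c^4}\hat{\mathcal{O}}^3,
\end{aligned} \tag{C.11}$$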



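with $\hat{\mathcal{E}}_1$ and $\hat{\mathcal{O}}_1$ obeying the relations $\beta\hat{\mathcal{O}}_1 = -\hat{\mathcal{O}}_1\beta$ and $\beta\hat{\mathcal{E}}_1 = \hat{\mathcal{E}}_1\beta$ exactly
like $\hat{\mathcal{E}}$ and $\hat{\mathcal{O}}$. It is seen that while the term $\hat{\mathcal{O}}$ in $\hat{H}_D$ is of order zero with
respect to the expansion parameter $1/m_0c^2$ (i.e., $\hat{\mathcal{O}} = O\left((1/m_0c^2)^0\right)$), the
odd part of $\hat{H}_D^{(1)}$, namely $\hat{\mathcal{O}}_1$, contains only terms of order $1/m_0c^2$ and higher
powers of $1/m_0c^2$ (i.e., $\hat{\mathcal{O}}_1 = O\left((1/m_0c^2)\right)$).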

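To reduce the strength of the odd terms further in the transformed
Hamiltonian a second Foldy-Wouthuysen transformation is applied with the same
prescription:

$$\begin{aligned}
\Psi^{(2)} &= e^{i\hat{S}_2}\Psi^{(1)}, \\
\hat{S}_2 &= -\frac{i\beta\hat{\mathcal{O}}_1}{2m_0c^2} \\
&= -\frac{i\beta}{2m_0c^2}\left[\frac{\beta}{2m_0c^2}\left([\hat{\mathcal{O}}, \hat{\mathcal{E}}] + i\hbar\frac{\partial\hat{\mathcal{O}}}{\partial t}\right) - \frac{1}{3m_0^2c^4}\hat{\mathcal{O}}^3\right].
\end{aligned} \tag{C.12}$$

After this transformation,

$$\begin{aligned}
i\hbar\frac{\partial}{\partial t}\Psi^{(2)} &= \hat{H}_D^{(2)}\Psi^{(2)}, \qquad \hat{H}_D^{(2)} = m_0c^2\beta + \hat{\mathcal{E}}_2 + \hat{\mathcal{O}}_2, \\
\hat{\mathcal{E}}_2 &\approx \hat{\mathcal{E}}_1, \qquad \hat{\mathcal{O}}_2 \approx \frac{\beta}{2m_0c^2}\left([\hat{\mathcal{O}}_1, \hat{\mathcal{E}}_1] + i\hbar\frac{\partial\hat{\mathcal{O}}_1}{\partial t}\right),
\end{aligned} \tag{C.13}$$

where, now, $\hat{\mathcal{O}}_2 = O\left((1/m_0c^2)^2\right)$. After the third transformation

$$\Psi^{(3)} = e^{i\hat{S}_3}\Psi^{(2)}, \qquad \hat{S}_3 = -\frac{i\beta\hat{\mathcal{O}}_2}{2m_0c^2}, \tag{C.14}$$

we have

$$\begin{aligned}
i\hbar\frac{\partial}{\partial t}\Psi^{(3)} &= \hat{H}_D^{(3)}\Psi^{(3)}, \qquad \hat{H}_D^{(3)} = m_0c^2\beta + \hat{\mathcal{E}}_3 + \hat{\mathcal{O}}_3, \\
\hat{\mathcal{E}}_3 &\approx \hat{\mathcal{E}}_2 \approx \hat{\mathcal{E}}_1, \qquad \hat{\mathcal{O}}_3 \approx \frac{\beta}{2m_0c^2}\left([\hat{\mathcal{O}}_2, \hat{\mathcal{E}}_2] + i\hbar\frac{\partial\hat{\mathcal{O}}_2}{\partial t}\right),
\end{aligned} \tag{C.15}$$

where $\hat{\mathcal{O}}_3 = O\left((1/m_0c^2)^3\right)$. So, neglecting $\hat{\mathcal{O}}_3$,

$$\begin{aligned}
\hat{H}_D^{(3)} &\approx m_0c^2\beta + \hat{\mathcal{E}} + \frac{1}{2m_0c^2}\beta\hat{\mathcal{O}}^2 - \frac{1}{8m_0^2c^4}\left[\hat{\mathcal{O}}, \left([\hat{\mathcal{O}}, \hat{\mathcal{E}}] + i\hbar\frac{\partial\hat{\mathcal{O}}}{\partial t}\right)\right] \\
&\quad - \frac{1}{8m_0^3c^6}\beta\left\{\hat{\mathcal{O}}^4 + \left([\hat{\mathcal{O}}, \hat{\mathcal{E}}] + i\hbar\frac{\partial\hat{\mathcal{O}}}{\partial t}\right)^2\right\}.
\end{aligned} \tag{C.16}$$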

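It may be noted that starting with the second transformation successive
$(\hat{\mathcal{E}}, \hat{\mathcal{O}})$ pairs can be obtained recursively using the rule

$$\begin{aligned}
\hat{\mathcal{E}}_j &= \hat{\mathcal{E}}_1\left(\hat{\mathcal{E}} \to \hat{\mathcal{E}}_{j-1},\ \hat{\mathcal{O}} \to \hat{\mathcal{O}}_{j-1}\right), \\
\hat{\mathcal{O}}_j &= \hat{\mathcal{O}}_1\left(\hat{\mathcal{E}} \to \hat{\mathcal{E}}_{j-1},\ \hat{\mathcal{O}} \to \hat{\mathcal{O}}_{j-1}\right), \qquad j > 1,
\end{aligned} \tag{C.17}$$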



and retaining only the relevant terms of desired order at each step. 

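With $\hat{\mathcal{E}} = q\phi$ and $\hat{\mathcal{O}} = c\boldsymbol{\alpha}\cdot\hat{\boldsymbol{\pi}}$, the final reduced Hamiltonian (C.16) is, to
the order calculated,

$$\begin{aligned}
\hat{H}_D^{(3)} &= \beta\left(m_0c^2 + \frac{\hat{\pi}^2}{2m_0} - \frac{\hat{p}^4}{8m_0^3c^2}\right) + q\phi - \frac{q\hbar}{2m_0c}\beta\boldsymbol{\Sigma}\cdot\mathbf{B} \\
&\quad - \frac{iq\hbar^2}{8m_0^2c^2}\boldsymbol{\Sigma}\cdot{\rm curl}\,\mathbf{E} - \frac{q\hbar}{4m_0^2c^2}\boldsymbol{\Sigma}\cdot\mathbf{E}\times\hat{\mathbf{p}} \\
&\quad - \frac{q\hbar^2}{8m_0^2c^2}{\rm div}\,\mathbf{E},
\end{aligned} \tag{C.18}$$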



with the individual terms having direct physical interpretations. The terms
in the first parenthesis result from the expansion of $\sqrt{m_0^2c^4 + c^2\hat{\pi}^2}$, showing
the effect of the relativistic mass increase. The second and third terms are
the electrostatic and magnetic dipole energies. The next two terms, taken
together (for hermiticity), contain the spin-orbit interaction. The last term,
the so-called Darwin term, is attributed to the zitterbewegung (trembling
motion) of the Dirac particle: because of the rapid coordinate fluctuations
over distances of the order of the Compton wavelength ($2\pi\hbar/m_0c$) the particle
sees a somewhat smeared out electric potential.

It is clear that the Foldy-Wouthuysen transformation technique expands
the Dirac Hamiltonian as a power series in the parameter $1/m_0c^2$, enabling the
use of a systematic approximation procedure for studying the deviations from
the nonrelativistic situation. We note the analogy between the nonrelativistic
particle dynamics and paraxial optics:



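The Analogy

| Standard Dirac Equation | Beam Optical Form |
|---|---|
| $m_0c^2\beta + \hat{\mathcal{E}}_D + \hat{\mathcal{O}}_D$ | $-n_0\beta + \hat{\mathcal{E}} + \hat{\mathcal{O}}$ |
| $m_0c^2$ | $-n_0$ |
| Positive Energy | Forward Propagation |
| Nonrelativistic, $\lvert\boldsymbol{\pi}\rvert \ll m_0c$ | Paraxial Beam, $\lvert\boldsymbol{p}_\perp\rvert \ll n_0$ |
| Nonrelativistic Motion + Relativistic Corrections | Paraxial Behavior + Aberration Corrections |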



Noting the above analogy, the idea of the Foldy-Wouthuysen form of the Dirac
theory has been adopted to study paraxial optics and deviations from it
by first casting the Maxwell equations in a spinor form resembling exactly the
Dirac equation (C.1, C.2) in all respects: i.e., a multicomponent $\Psi$ having
the upper half of its components large compared to the lower components
and the Hamiltonian having an even part ($\hat{\mathcal{E}}$), an odd part ($\hat{\mathcal{O}}$), a suitable
expansion parameter ($\lvert\hat{\boldsymbol{p}}_\perp\rvert/n_0 \ll 1$) characterizing the dominant forward
propagation, and a leading term with a $\beta$ coefficient commuting with $\hat{\mathcal{E}}$ and
anticommuting with $\hat{\mathcal{O}}$. The additional feature of our formalism is to return
finally to the original representation after making an extra approximation,
dropping $\beta$ from the final reduced optical Hamiltonian, taking into account
the fact that we are primarily interested only in the forward-propagating
beam.
Appendix-D
The Magnus Formula 



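The Magnus formula is the continuous analogue of the famous
Baker-Campbell-Hausdorff (BCH) formula

$$
e^{\hat{A}}\,e^{\hat{B}} = e^{\hat{A} + \hat{B} + \frac{1}{2}[\hat{A},\hat{B}] + \frac{1}{12}\left\{[\hat{A},[\hat{A},\hat{B}]] + [[\hat{A},\hat{B}],\hat{B}]\right\} + \cdots}.
\tag{D.1}
$$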
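The BCH formula (D.1), truncated after the double commutators, can be tested numerically on small matrices. The sketch below uses plain Python lists for 2 x 2 matrices; the helper names and the particular test matrices `A0` and `B0` are assumptions chosen for illustration, not part of the formalism.

```python
def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(*Ms):
    n = len(Ms[0])
    return [[sum(M[i][j] for M in Ms) for j in range(n)] for i in range(n)]

def scale(a, X):
    return [[a * x for x in row] for row in X]

def comm(X, Y):
    """Commutator [X, Y] = XY - YX."""
    return add(matmul(X, Y), scale(-1.0, matmul(Y, X)))

def expm(X, terms=30):
    """Matrix exponential by its Taylor series (adequate for small norms)."""
    n = len(X)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = scale(1.0 / k, matmul(term, X))
        result = add(result, term)
    return result

def bch3(A, B):
    """Exponent of (D.1) up to and including the double commutators."""
    AB = comm(A, B)
    doubles = add(comm(A, AB), comm(AB, B))
    return add(A, B, scale(0.5, AB), scale(1.0 / 12.0, doubles))

def max_abs_diff(X, Y):
    return max(abs(x - y) for rx, ry in zip(X, Y) for x, y in zip(rx, ry))

# Hypothetical non-commuting test matrices.
A0 = [[0.0, 1.0], [0.0, 0.0]]
B0 = [[0.0, 0.0], [1.0, 0.0]]

def bch_error(eps):
    """Difference between exp(A)exp(B) and exp(truncated BCH exponent)."""
    A, B = scale(eps, A0), scale(eps, B0)
    return max_abs_diff(matmul(expm(A), expm(B)), expm(bch3(A, B)))

for eps in (0.05, 0.1):
    print(f"eps = {eps:4.2f}  error = {bch_error(eps):.3e}")
```

Since the first neglected term in the exponent is of fourth order, doubling `eps` should increase the error by a factor close to 16.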

Let it be required to solve the differential equation

$$
\frac{\partial}{\partial t}u(t) = \hat{A}(t)\,u(t)
\tag{D.2}
$$

to get $u(T)$ at $T > t_0$, given the value of $u(t_0)$; the operator $\hat{A}$ can represent
any linear operation. For an infinitesimal $\Delta t$, we can write

$$
u(t_0 + \Delta t) = e^{\Delta t\,\hat{A}(t_0)}\,u(t_0).
\tag{D.3}
$$

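Iterating this solution we have

$$
\begin{aligned}
u(t_0 + 2\Delta t) &= e^{\Delta t\,\hat{A}(t_0+\Delta t)}\,e^{\Delta t\,\hat{A}(t_0)}\,u(t_0), \\
u(t_0 + 3\Delta t) &= e^{\Delta t\,\hat{A}(t_0+2\Delta t)}\,e^{\Delta t\,\hat{A}(t_0+\Delta t)}\,e^{\Delta t\,\hat{A}(t_0)}\,u(t_0),
\end{aligned}
\qquad \text{and so on.}
\tag{D.4}
$$

If $T = t_0 + N\Delta t$ we would have

$$
u(T) = \left\{\prod_{n=0}^{N-1} e^{\Delta t\,\hat{A}(t_0+n\Delta t)}\right\} u(t_0),
\tag{D.5}
$$

with the factors ordered as in (D.4), later times standing to the left.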



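Thus, $u(T)$ is given by computing the product in (D.5) using successively
the BCH-formula (D.1) and considering the limit $\Delta t \to 0$, $N \to \infty$ such
that $N\Delta t = T - t_0$. The resulting expression is the Magnus formula
(Magnus, [29]):

$$
\begin{aligned}
u(T) &= \hat{\mathcal{T}}(T,t_0)\,u(t_0), \\
\hat{\mathcal{T}}(T,t_0) &= \exp\Bigg\{\int_{t_0}^{T} dt_1\,\hat{A}(t_1)
+ \frac{1}{2}\int_{t_0}^{T} dt_2\int_{t_0}^{t_2} dt_1\,\left[\hat{A}(t_2),\hat{A}(t_1)\right] \\
&\qquad + \frac{1}{6}\int_{t_0}^{T} dt_3\int_{t_0}^{t_3} dt_2\int_{t_0}^{t_2} dt_1
\Big(\left[\left[\hat{A}(t_3),\hat{A}(t_2)\right],\hat{A}(t_1)\right]
+ \left[\left[\hat{A}(t_1),\hat{A}(t_2)\right],\hat{A}(t_3)\right]\Big) + \cdots\Bigg\}.
\end{aligned}
\tag{D.6}
$$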
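The relation between the ordered product (D.5) and the Magnus exponent (D.6) can be illustrated numerically. The sketch below assumes a hypothetical generator $A(t) = X + tY$ with constant 2 x 2 matrices, for which the first two integrals in (D.6) can be done by hand with $t_0 = 0$: they give $TX + (T^2/2)Y$ and $-(T^3/12)[X,Y]$. The reference product samples $A$ at the midpoint of each step rather than at the left end as in (D.5), purely to speed up convergence; all names are illustrative.

```python
def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(*Ms):
    n = len(Ms[0])
    return [[sum(M[i][j] for M in Ms) for j in range(n)] for i in range(n)]

def scale(a, X):
    return [[a * x for x in row] for row in X]

def comm(X, Y):
    return add(matmul(X, Y), scale(-1.0, matmul(Y, X)))

def expm(X, terms=30):
    """Matrix exponential by its Taylor series (adequate for small norms)."""
    n = len(X)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = scale(1.0 / k, matmul(term, X))
        result = add(result, term)
    return result

def max_abs_diff(P, Q):
    return max(abs(p - q) for rp, rq in zip(P, Q) for p, q in zip(rp, rq))

# Hypothetical generator A(t) = X + t*Y; X and Y do not commute.
X = [[0.0, 1.0], [0.0, 0.0]]
Y = [[0.0, 0.0], [1.0, 0.0]]

def A(t):
    return add(X, scale(t, Y))

def ordered_product(T, steps=400):
    """Product of short-step exponentials, later times to the left."""
    dt = T / steps
    U = [[1.0, 0.0], [0.0, 1.0]]
    for n in range(steps):
        U = matmul(expm(scale(dt, A((n + 0.5) * dt))), U)
    return U

def magnus(T, order):
    """exp of the Magnus exponent truncated at first or second order."""
    omega = add(scale(T, X), scale(T**2 / 2.0, Y))             # single integral
    if order >= 2:
        omega = add(omega, scale(-T**3 / 12.0, comm(X, Y)))    # commutator term
    return expm(omega)

T = 1.0
reference = ordered_product(T)
err1 = max_abs_diff(magnus(T, 1), reference)
err2 = max_abs_diff(magnus(T, 2), reference)
print(f"first-order Magnus error  = {err1:.3e}")
print(f"second-order Magnus error = {err2:.3e}")
```

Including the commutator term should bring the exponential noticeably closer to the ordered product; the remaining difference is governed by the triple-integral term in (D.6).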



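To see how the equation (D.6) is obtained let us substitute the assumed
form of the solution, $u(t) = \hat{\mathcal{T}}(t,t_0)\,u(t_0)$, in (D.2). Then, it is seen that
$\hat{\mathcal{T}}(t,t_0)$ obeys the equation

$$
\frac{\partial}{\partial t}\hat{\mathcal{T}}(t,t_0) = \hat{A}(t)\,\hat{\mathcal{T}}(t,t_0), \qquad
\hat{\mathcal{T}}(t_0,t_0) = \hat{I}.
\tag{D.7}
$$

Introducing an iteration parameter $\lambda$ in (D.7), let

$$
\frac{\partial}{\partial t}\hat{\mathcal{T}}(t,t_0;\lambda) = \lambda\hat{A}(t)\,\hat{\mathcal{T}}(t,t_0;\lambda),
\tag{D.8}
$$
$$
\hat{\mathcal{T}}(t_0,t_0;\lambda) = \hat{I}, \qquad
\hat{\mathcal{T}}(t,t_0;1) = \hat{\mathcal{T}}(t,t_0).
\tag{D.9}
$$

Assume a solution of (D.8) to be of the form

$$
\hat{\mathcal{T}}(t,t_0;\lambda) = e^{\Omega(t,t_0;\lambda)}
\tag{D.10}
$$

with

$$
\Omega(t,t_0;\lambda) = \sum_{n=1}^{\infty}\lambda^n\Delta_n(t,t_0), \qquad
\Delta_n(t_0,t_0) = 0 \ \text{ for all } n.
\tag{D.11}
$$

Now, using the identity (see Wilcox, [41])

$$
\frac{\partial}{\partial t}e^{\Omega(t,t_0;\lambda)} =
\left\{\int_0^1 ds\; e^{s\Omega(t,t_0;\lambda)}\,\frac{\partial}{\partial t}\Omega(t,t_0;\lambda)\,e^{-s\Omega(t,t_0;\lambda)}\right\} e^{\Omega(t,t_0;\lambda)}
\tag{D.12}
$$

one has

$$
\int_0^1 ds\; e^{s\Omega(t,t_0;\lambda)}\,\frac{\partial}{\partial t}\Omega(t,t_0;\lambda)\,e^{-s\Omega(t,t_0;\lambda)} = \lambda\hat{A}(t).
\tag{D.13}
$$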

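Substituting in (D.13) the series expression for $\Omega(t,t_0;\lambda)$ (D.11), expanding
the left hand side using the first identity in (D.8), integrating and equating
the coefficients of $\lambda^j$ on both sides, we get, recursively, the equations for
$\Delta_1(t,t_0)$, $\Delta_2(t,t_0)$, ..., etc. For $j = 1$

$$
\frac{\partial}{\partial t}\Delta_1(t,t_0) = \hat{A}(t), \qquad \Delta_1(t_0,t_0) = 0
\tag{D.14}
$$

and hence

$$
\Delta_1(t,t_0) = \int_{t_0}^{t} dt_1\,\hat{A}(t_1).
\tag{D.15}
$$

For $j = 2$

$$
\frac{\partial}{\partial t}\Delta_2(t,t_0) + \frac{1}{2}\left[\Delta_1(t,t_0),\frac{\partial}{\partial t}\Delta_1(t,t_0)\right] = 0, \qquad \Delta_2(t_0,t_0) = 0
\tag{D.16}
$$

and hence

$$
\Delta_2(t,t_0) = \frac{1}{2}\int_{t_0}^{t} dt_2\int_{t_0}^{t_2} dt_1\,\left[\hat{A}(t_2),\hat{A}(t_1)\right].
\tag{D.17}
$$

Similarly,

$$
\Delta_3(t,t_0) = \frac{1}{6}\int_{t_0}^{t} dt_3\int_{t_0}^{t_3} dt_2\int_{t_0}^{t_2} dt_1
\Big\{\left[\left[\hat{A}(t_1),\hat{A}(t_2)\right],\hat{A}(t_3)\right]
+ \left[\left[\hat{A}(t_3),\hat{A}(t_2)\right],\hat{A}(t_1)\right]\Big\}.
\tag{D.18}
$$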



Then, the Magnus formula in (D.6) follows from (D.9)-(D.11). Equation (38),
in the context of z-evolution, follows from the above discussion with
the identification $t \to z$, $t_0 \to z^{(1)}$, $T \to z^{(2)}$ and $\hat{A}(t) \to -\frac{i}{\bar{\lambda}}\hat{H}(z)$.

For more details on the exponential solutions of linear differential equa- 
tions, related operator techniques and applications to physical problems the 
reader is referred to Wilcox [41], Bellman and Vasudevan [42], Dattoli et 
al. [43], and references therein. 
Appendix-E
Analogies between light optics and 
charged-particle optics: Recent Developments 

Historically, variational principles have played a fundamental role in the
evolution of mathematical models in classical physics, and many equations
can be derived by using them. Here the relevant examples are Fermat's
principle in optics and Maupertuis' principle in mechanics. The beginning of the
analogy between geometrical optics and mechanics is usually attributed to
Descartes (1637), but actually it can be traced back to Ibn Al-Haitham Alhazen
(965-1037) [44]. The analogy between the trajectory of material particles in
potential fields and the path of light rays in media with continuously variable
refractive index was formalized by Hamilton in 1833. This Hamiltonian
analogy led to the development of electron optics in the 1920s, when Busch derived
the focusing action and a lens-like action of the axially symmetric magnetic
field using the methodology of geometrical optics. Around the same time
Louis de Broglie associated his now famous wavelength with moving particles.
Schrödinger extended the analogy by passing from geometrical optics to wave
optics through his wave equation incorporating the de Broglie wavelength.
This analogy played a fundamental role in the early development of quantum
mechanics. The analogy, on the other hand, led to the development
of practical electron optics, and one of the early inventions was the electron
microscope by Ernst Ruska. A detailed account of Hamilton's analogy is
available in [45]-[47].

Until very recently, it was possible to see this analogy only between the
geometrical-optic and classical prescriptions of electron optics. The reason
is that the quantum theories of charged-particle beam optics have been
under development only for about a decade [5]-[13], with the expected
feature of wavelength-dependent effects, which have no analogue in the
traditional descriptions of light beam optics. With the current development
of the non-traditional prescriptions of Helmholtz optics [3, 4] and the
matrix formulation of Maxwell optics, accompanied by wavelength-dependent
effects, it is seen that the analogy between the two systems persists. The
non-traditional prescription of Helmholtz optics is in close analogy with the
quantum theory of charged-particle beam optics based on the Klein-Gordon
equation. The matrix formulation of Maxwell optics is in close analogy with
the quantum theory of charged-particle beam optics based on the Dirac
equation. This analogy is summarized in the table of Hamiltonians. In this short
note it is difficult to present the derivation of the various Hamiltonians, which
are available in the references. We shall briefly consider an outline of the
quantum prescriptions and the non-traditional prescriptions respectively. A
complete coverage of the new field of Quantum Aspects of Beam Physics
(QABP) can be found in the proceedings of the series of meetings under
the same name [48].

E.1 Quantum Formalism of Charged-Particle Beam Optics

The classical treatment of charged-particle beam optics has been extremely 
successful in the designing and working of numerous optical devices, from 
electron microscopes to very large particle accelerators. It is natural, however,
to look for a prescription based on the quantum theory, since any physical
system is quantum mechanical at the fundamental level! Such a prescription 
is sure to explain the grand success of the classical theories. It is sure to help 
get a deeper understanding and lead to better designing of charged-particle 
beam devices. 

The starting point of the quantum prescription of charged-particle beam
optics is to build a theory based on the basic equations of quantum mechanics
(Schrödinger, Klein-Gordon, Dirac) appropriate to the situation under study.
In order to analyze the evolution of the beam parameters of the various
individual beam optical elements (quadrupoles, bending magnets, ...) along
the optic axis of the system, the first step is to start with the basic
time-dependent equations of quantum mechanics and then obtain an equation of
the form

$$
i\hbar\frac{\partial}{\partial s}\psi(x,y;s) = \hat{\mathcal{H}}(x,y;s)\,\psi(x,y;s),
\tag{E.1}
$$

where $(x, y; s)$ constitute a curvilinear coordinate system, adapted to the
geometry of the system. Eq. (E.1) is the basic equation in the quantum
formalism, called the beam-optical equation; $\hat{\mathcal{H}}$ and $\psi$ are the beam-optical
Hamiltonian and the beam wavefunction respectively. The second step
requires obtaining a relationship between any relevant observable $\{\langle O\rangle(s)\}$ at
the transverse plane at $s$ and the observable $\{\langle O\rangle(s_{\rm in})\}$ at the transverse
plane at $s_{\rm in}$, where $s_{\rm in}$ is some input reference point. This is achieved by the
integration of the beam-optical equation in (E.1),

$$
\psi(x,y;s) = \hat{U}(s,s_{\rm in})\,\psi(x,y;s_{\rm in}),
\tag{E.2}
$$

which gives the transfer map

$$
\langle O\rangle(s) = \langle\psi(x,y;s)|\,O\,|\psi(x,y;s)\rangle
= \langle\psi(x,y;s_{\rm in})|\,\hat{U}^{\dagger}O\,\hat{U}\,|\psi(x,y;s_{\rm in})\rangle.
\tag{E.3}
$$

The two-step algorithm stated above gives an over-simplified picture of
the quantum formalism. There are several crucial points to be noted. The
first step in the algorithm of obtaining the beam-optical equation is not to be
treated as a mere transformation which eliminates $t$ in preference to a variable
$s$ along the optic axis. A clever set of transforms is required which not only
eliminates the variable $t$ in preference to $s$ but also gives us an $s$-dependent
equation which has a close physical and mathematical correspondence with
the original $t$-dependent equation of standard time-dependent quantum
mechanics. The imposition of this stringent requirement on the construction
of the beam-optical equation ensures the execution of the second step of the
algorithm. The beam-optical equation is such that all the required rich
machinery of quantum mechanics becomes applicable to the computation of
the transfer maps that characterize the optical system. This describes the
essential scheme of obtaining the quantum formalism. The rest is mostly
mathematical detail which is inbuilt in the powerful algebraic machinery
of the algorithm, accompanied by some reasonable assumptions and
approximations dictated by the physical considerations. The nature of these
approximations can be best summarized in the optical terminology as a
systematic procedure of expanding the beam optical Hamiltonian in a power
series of $|\hat{\boldsymbol{\pi}}_\perp/p_0|$, where $p_0$ is the design (or average) momentum of beam
particles moving predominantly along the direction of the optic axis and $\hat{\boldsymbol{\pi}}_\perp$
is the small transverse kinetic momentum. The leading order approximation,
along with $|\hat{\boldsymbol{\pi}}_\perp/p_0| \ll 1$, constitutes the paraxial or ideal behaviour and
higher order terms in the expansion give rise to the nonlinear or aberrating
behaviour. It is seen that the paraxial and aberrating behaviour get modified
by the quantum contributions which are in powers of the de Broglie
wavelength ($\bar{\lambda}_0 = \hbar/p_0$). The classical limit of the quantum formalism reproduces
the well known Lie algebraic formalism of charged-particle beam optics [49].
E.2 Light Optics: Various Prescriptions

The traditional scalar wave theory of optics (including aberrations to all or- 
ders) is based on the beam-optical Hamiltonian derived by using Fermat's 
principle. This approach is purely geometrical and works adequately in 
the scalar regime. The other approach is based on the square-root of the 
Helmholtz operator, which is derived from the Maxwell equations [49]. This 
approach works to all orders and the resulting expansion is no different from 
the one obtained using the geometrical approach of Fermat's principle. As for 
the polarization: a systematic procedure for the passage from scalar to vector 
wave optics to handle paraxial beam propagation problems, completely tak- 
ing into account the way in which the Maxwell equations couple the spatial 
variation and polarization of light waves, has been formulated by analyzing 
the basic Poincare invariance of the system, and this procedure has been 
successfully used to clarify several issues in Maxwell optics [14]-[17]. 

In the above approaches, the beam-optics and the polarization are studied 
separately, using very different machineries. The derivation of the Helmholtz 
equation from the Maxwell equations is an approximation as one neglects the 
spatial and temporal derivatives of the permittivity and permeability of the 
medium. Any prescription based on the Helmholtz equation is bound to be an 
approximation, irrespective of how good it may be in certain situations. It is 
very natural to look for a prescription based fully on the Maxwell equations, 
which is sure to provide a deeper understanding of beam-optics and light 
polarization in a unified manner. 

The two-step algorithm used in the construction of the quantum theories 
of charged-particle beam optics is very much applicable in light optics! But 
there are some very significant conceptual differences to be borne in mind. 
When going beyond Fermat's principle the whole of optics is completely 
governed by the Maxwell equations, and there are no other equations, unlike 
in quantum mechanics, where there are separate equations for spin-1/2,
spin-1, and so on.

Maxwell's equations are linear (in time and space derivatives) but coupled
in the fields. The decoupling leads to the Helmholtz equation, which is
quadratic in derivatives. In the specific context of beam optics, purely from
a calculational point of view, the starting equation is the Helmholtz equation
governing scalar optics, and for a more accurate prescription one uses
the full set of Maxwell equations, leading to vector optics. In the context of
the two-step algorithm, the Helmholtz equation and the Maxwell equations
in a matrix representation can be treated as the 'basic' equations, analogues
of the basic equations of quantum mechanics. This works perfectly fine from
a calculational point of view in the scheme of the algorithm we have.

Exploiting the similarity between the Helmholtz wave equation and the 
Klein-Gordon equation, the former is linearized using the Feshbach-Villars 
procedure used for the linearization of the Klein-Gordon equation. Then the 
Foldy-Wouthuysen iterative diagonalization technique is applied to obtain 
a Hamiltonian description for a system with varying refractive index. This 
technique is an alternative to the conventional method of series expansion of 
the radical. Besides reproducing all the traditional quasiparaxial terms, this 
method leads to additional terms, which are dependent on the wavelength, 
in the optical Hamiltonian. This is the non-traditional prescription of scalar 
optics. 
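The conventional series expansion of the radical mentioned above can be made concrete for a homogeneous medium. The sketch below treats $n$ and $p_\perp$ as ordinary numbers, so it shows only the traditional paraxial and aberration terms of $-\sqrt{n^2 - p_\perp^2}$ and none of the wavelength-dependent commutator terms; the function names and the sample values are assumptions for illustration.

```python
import math

def exact_hamiltonian(p_perp, n=1.5):
    """Fermat-principle Hamiltonian -sqrt(n^2 - p_perp^2) for constant n."""
    return -math.sqrt(n**2 - p_perp**2)

def series_hamiltonian(p_perp, n=1.5, order=2):
    """Truncated expansion of the radical.

    order 1: -n + p^2/(2n)            (paraxial)
    order 2: ... + p^4/(8 n^3)        (leading aberration)
    order 3: ... + p^6/(16 n^5)
    """
    x = (p_perp / n) ** 2
    coeffs = [0.5, 0.125, 0.0625]
    return -n + n * sum(c * x ** (k + 1) for k, c in enumerate(coeffs[:order]))

def truncation_error(p_perp, order, n=1.5):
    return abs(exact_hamiltonian(p_perp, n) - series_hamiltonian(p_perp, n, order))

for order in (1, 2, 3):
    print(f"order {order}: error = {truncation_error(0.3, order):.3e}")
```

Each additional term should reduce the error by roughly a factor of $(p_\perp/n)^2$, which is the sense in which $|p_\perp|/n \ll 1$ serves as the expansion parameter.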

The Maxwell equations can be cast into an exact matrix form taking into
account the spatial and temporal variations of the permittivity and
permeability. The derived representation using 8 × 8 matrices has a close algebraic
analogy with the Dirac equation, enabling the use of the rich machinery
of the Dirac electron theory. The beam optical Hamiltonian derived from
this representation reproduces the Hamiltonians obtained in the traditional
prescription along with wavelength-dependent matrix terms, which we have
named the polarization terms. These polarization terms are very similar to
the spin terms in the Dirac electron theory and the spin-precession terms in
the beam-optical version of the Thomas-BMT equation [10]. The matrix
formulation provides a unified treatment of beam optics and light polarization.
Some well known results of light polarization are obtained as the paraxial
limit of the matrix formulation [14]-[17]. The traditional beam optics is
completely obtained from our approach in the limit of small wavelength, $\bar{\lambda} \to 0$,
which we call the traditional limit of our formalisms. This is analogous to
the classical limit obtained by taking $\hbar \to 0$ in the quantum prescriptions.

From the Hamiltonians in the Table we make the following observations:
The classical/traditional Hamiltonians of particle/light optics are modified
by wavelength-dependent contributions in the quantum/non-traditional
prescriptions respectively. The algebraic forms of these modifications in each
row are very similar. This should not come as a big surprise. The starting
equations have a one-to-one algebraic correspondence: Helmholtz <-> Klein-Gordon;
Matrix form of Maxwell <-> Dirac equation. Lastly, the de Broglie
wavelength, $\bar{\lambda}_0$, and $\bar{\lambda}$ have an analogous status, and the classical/traditional
limit is obtained by taking $\bar{\lambda}_0 \to 0$ and $\bar{\lambda} \to 0$ respectively. The parallel
of the analogies between the two systems is sure to provide us with more
insights.



Appendix-F. 
An Invitation to the Experimentalists 

It would be worthwhile to experimentally look for the predicted image 
rotation and the wavelength-dependent modifications of the aberration coef- 
ficients. 
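Table A.

Hamiltonians in Different Prescriptions

The following are the Hamiltonians in the different prescriptions of light beam
optics and charged-particle beam optics for magnetic systems. $\hat{H}_{0,p}$ are the paraxial
Hamiltonians, with lowest order wavelength-dependent contributions.

| Light Beam Optics | Charged-Particle Beam Optics |
|---|---|
| Fermat's Principle: $H = -\left\{n^2(\boldsymbol{r}) - \boldsymbol{p}_\perp^2\right\}^{1/2}$ | Maupertuis' Principle: $H = -\left\{p_0^2 - \boldsymbol{\pi}_\perp^2\right\}^{1/2} - qA_z$ |
| Non-Traditional Helmholtz: $\hat{H}_{0,p} = \cdots$ | Klein-Gordon Formalism: $\hat{H}_{0,p} = -p_0 - qA_z + \cdots$ |
| Maxwell, Matrix: $\hat{H}_{0,p} = \cdots - i\bar{\lambda}\beta\,\boldsymbol{\Sigma}\cdot\boldsymbol{u} + \cdots\,\bar{\lambda}^2\,\cdots$ | Dirac Formalism: $\hat{H}_{0,p} = -p_0 - qA_z + \cdots - (\cdots)\left\{\mu\gamma\,\boldsymbol{\Sigma}_\perp\cdot\boldsymbol{B}_\perp + (q + \mu)\Sigma_z B_z\right\} + i(\cdots)\,\epsilon B_z$ |

Notation

| Light Beam Optics | Charged-Particle Beam Optics |
|---|---|
| Refractive Index, $n(\boldsymbol{r}) = c\sqrt{\epsilon(\boldsymbol{r})\mu(\boldsymbol{r})}$ | $\hat{\boldsymbol{\pi}}_\perp = \hat{\boldsymbol{p}}_\perp - q\boldsymbol{A}_\perp$ |
| Resistance, $h(\boldsymbol{r}) = \sqrt{\mu(\boldsymbol{r})/\epsilon(\boldsymbol{r})}$ | $\mu_a$: anomalous magnetic moment |
| $\boldsymbol{\Sigma}$ and $\beta$ are the Dirac matrices. | $\epsilon_a$: anomalous electric moment |
| | $\mu = 2m_0\mu_a/\hbar$, $\epsilon = 2m_0\epsilon_a/\hbar$ |
| | $\gamma = E/m_0c^2$ |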